[jira] [Assigned] (FLINK-33211) Implement table lineage graph

2024-03-20 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-33211:
-

Assignee: Zhenqiu Huang  (was: Peter Huang)

> Implement table lineage graph
> -
>
> Key: FLINK-33211
> URL: https://issues.apache.org/jira/browse/FLINK-33211
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Affects Versions: 1.19.0
>Reporter: Fang Yong
>Assignee: Zhenqiu Huang
>Priority: Major
>
> Implement table lineage graph



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33211) Implement table lineage graph

2024-03-20 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-33211:
-

Assignee: Peter Huang

> Implement table lineage graph
> -
>
> Key: FLINK-33211
> URL: https://issues.apache.org/jira/browse/FLINK-33211
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Affects Versions: 1.19.0
>Reporter: Fang Yong
>Assignee: Peter Huang
>Priority: Major
>
> Implement table lineage graph



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-33210) Introduce lineage graph relevant interfaces

2024-03-05 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong closed FLINK-33210.
-
Fix Version/s: 1.20.0
   Resolution: Resolved

Closed by 6b9c282b5090720a02b941586cb7f5389691dba9

> Introduce lineage graph relevant interfaces 
> 
>
> Key: FLINK-33210
> URL: https://issues.apache.org/jira/browse/FLINK-33210
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / DataStream, Table SQL / API
>Affects Versions: 1.19.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> Introduce LineageGraph, LineageVertex and LineageEdge interfaces



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34460) Jdbc driver get rid of flink-core

2024-02-21 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong updated FLINK-34460:
--
Description: Currently the JDBC driver depends on the flink-core/flink-runtime 
modules, so users need to upgrade the JDBC driver version whenever the Flink 
session cluster for OLAP is upgraded, which is not ideal.  (was: Currently jdbc 
driver depends on flink-core/flink-runtime module, users needs to upgrade jdbc 
driver version when the flink session cluster for olap is upgraded, this is not 
very suitable.)

> Jdbc driver get rid of flink-core
> -
>
> Key: FLINK-34460
> URL: https://issues.apache.org/jira/browse/FLINK-34460
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Client
>Affects Versions: 1.20.0
>Reporter: Fang Yong
>Priority: Major
>
> Currently the JDBC driver depends on the flink-core/flink-runtime modules, so 
> users need to upgrade the JDBC driver version whenever the Flink session 
> cluster for OLAP is upgraded, which is not ideal.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-34466) Implement Lineage Interface in Kafka Connector

2024-02-20 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-34466:
-

Assignee: Zhenqiu Huang

> Implement Lineage Interface in Kafka Connector
> --
>
> Key: FLINK-34466
> URL: https://issues.apache.org/jira/browse/FLINK-34466
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Kafka
>Affects Versions: 1.19.0
>Reporter: Zhenqiu Huang
>Assignee: Zhenqiu Huang
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-34468) Implement Lineage Interface in Cassandra Connector

2024-02-20 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-34468:
-

Assignee: Zhenqiu Huang

> Implement Lineage Interface in Cassandra Connector
> --
>
> Key: FLINK-34468
> URL: https://issues.apache.org/jira/browse/FLINK-34468
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Cassandra
>Reporter: Zhenqiu Huang
>Assignee: Zhenqiu Huang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-34467) Implement Lineage Interface in Jdbc Connector

2024-02-20 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-34467:
-

Assignee: Zhenqiu Huang

> Implement Lineage Interface in Jdbc Connector
> -
>
> Key: FLINK-34467
> URL: https://issues.apache.org/jira/browse/FLINK-34467
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / JDBC
>Affects Versions: 1.19.0
>Reporter: Zhenqiu Huang
>Assignee: Zhenqiu Huang
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34460) Jdbc driver get rid of flink-core

2024-02-18 Thread Fang Yong (Jira)
Fang Yong created FLINK-34460:
-

 Summary: Jdbc driver get rid of flink-core
 Key: FLINK-34460
 URL: https://issues.apache.org/jira/browse/FLINK-34460
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Client
Affects Versions: 1.20.0
Reporter: Fang Yong


Currently the JDBC driver depends on the flink-core/flink-runtime modules, so 
users need to upgrade the JDBC driver version whenever the Flink session 
cluster for OLAP is upgraded, which is not ideal.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34239) Introduce a deep copy method of SerializerConfig for merging with Table configs in org.apache.flink.table.catalog.DataTypeFactoryImpl

2024-02-07 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17815486#comment-17815486
 ] 

Fang Yong commented on FLINK-34239:
---

[~mallikarjuna] DONE

> Introduce a deep copy method of SerializerConfig for merging with Table 
> configs in org.apache.flink.table.catalog.DataTypeFactoryImpl 
> --
>
> Key: FLINK-34239
> URL: https://issues.apache.org/jira/browse/FLINK-34239
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Core
>Affects Versions: 1.19.0
>Reporter: Zhanghao Chen
>Assignee: Kumar Mallikarjuna
>Priority: Major
>
> *Problem*
> Currently, 
> org.apache.flink.table.catalog.DataTypeFactoryImpl#createSerializerExecutionConfig
>  creates a deep copy of the SerializerConfig and merges the Table config into 
> it. However, the deep copy is done by manually calling the getter and setter 
> methods of SerializerConfig, which is prone to human error, e.g. forgetting to 
> copy a newly added field of SerializerConfig.
> *Proposal*
> Introduce a deep copy method for SerializerConfig and replace the current 
> implementation in 
> org.apache.flink.table.catalog.DataTypeFactoryImpl#createSerializerExecutionConfig.
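
For illustration only, a minimal sketch of what such a deep-copy method could 
look like; the class and field names below are hypothetical stand-ins, not 
Flink's actual SerializerConfig API.

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the proposed deep-copy method; names are illustrative only.
public class SerializerConfigSketch {
    private final Map<String, String> serializerOptions = new LinkedHashMap<>();

    /** Deep copy so Table-specific settings can be merged without mutating the shared config. */
    public SerializerConfigSketch copy() {
        SerializerConfigSketch copied = new SerializerConfigSketch();
        copied.serializerOptions.putAll(this.serializerOptions); // copy mutable state field by field
        return copied;
    }

    public static void main(String[] args) {
        SerializerConfigSketch original = new SerializerConfigSketch();
        original.serializerOptions.put("force-kryo", "false"); // hypothetical option
        SerializerConfigSketch merged = original.copy();
        merged.serializerOptions.put("force-kryo", "true");    // mutate the copy only
        System.out.println(original.serializerOptions + " vs " + merged.serializerOptions);
    }
}
{code}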



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-34239) Introduce a deep copy method of SerializerConfig for merging with Table configs in org.apache.flink.table.catalog.DataTypeFactoryImpl

2024-02-07 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-34239:
-

Assignee: Kumar Mallikarjuna

> Introduce a deep copy method of SerializerConfig for merging with Table 
> configs in org.apache.flink.table.catalog.DataTypeFactoryImpl 
> --
>
> Key: FLINK-34239
> URL: https://issues.apache.org/jira/browse/FLINK-34239
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Core
>Affects Versions: 1.19.0
>Reporter: Zhanghao Chen
>Assignee: Kumar Mallikarjuna
>Priority: Major
>
> *Problem*
> Currently, 
> org.apache.flink.table.catalog.DataTypeFactoryImpl#createSerializerExecutionConfig
>  creates a deep copy of the SerializerConfig and merges the Table config into 
> it. However, the deep copy is done by manually calling the getter and setter 
> methods of SerializerConfig, which is prone to human error, e.g. forgetting to 
> copy a newly added field of SerializerConfig.
> *Proposal*
> Introduce a deep copy method for SerializerConfig and replace the current 
> implementation in 
> org.apache.flink.table.catalog.DataTypeFactoryImpl#createSerializerExecutionConfig.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-34402) Class loading conflicts when using PowerMock in ITcase.

2024-02-06 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-34402:
-

Assignee: yisha zhou

>  Class loading conflicts when using PowerMock in ITcase.
> 
>
> Key: FLINK-34402
> URL: https://issues.apache.org/jira/browse/FLINK-34402
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.19.0
>Reporter: yisha zhou
>Assignee: yisha zhou
>Priority: Major
> Fix For: 1.19.0
>
>
> Currently, when no user jars exist, the system classLoader is used to load 
> classes by default. However, if we use PowerMock to create some ITCases, the 
> framework will utilize JavassistMockClassLoader to load classes. Forcing the 
> use of the system classLoader can lead to class loading conflicts.
> Therefore we should use Thread.currentThread().getContextClassLoader() 
> instead of 
> ClassLoader.getSystemClassLoader() here.
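
A hedged sketch of the proposed swap; the helper below is hypothetical, only 
the two class-loader calls are taken from the description above.

{code:java}
public final class ClassLoadingSketch {
    // Hypothetical helper: prefer the thread context class loader so mock class loaders
    // installed by test frameworks (e.g. PowerMock's JavassistMockClassLoader) are honored.
    static ClassLoader userClassLoader() {
        ClassLoader contextLoader = Thread.currentThread().getContextClassLoader();
        // Fall back to the system class loader only when no context class loader is set.
        return contextLoader != null ? contextLoader : ClassLoader.getSystemClassLoader();
    }

    public static void main(String[] args) throws Exception {
        // Load a class through the context class loader instead of forcing the system class loader.
        Class<?> clazz = Class.forName("java.lang.String", false, userClassLoader());
        System.out.println(clazz.getName());
    }
}
{code}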



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship

2024-01-31 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17813003#comment-17813003
 ] 

Fang Yong commented on FLINK-31275:
---

Hi [~mobuchowski] [~ZhenqiuHuang], I feel there are many details that need to 
be discussed regarding the interfaces. Would it be convenient for you to send 
an email to me (zjur...@gmail.com) so that we can schedule an offline meeting 
to discuss it first, and then start a new discussion on the dev mailing list? 
Hope to hear from you, thanks :)

> Flink supports reporting and storage of source/sink tables relationship
> ---
>
> Key: FLINK-31275
> URL: https://issues.apache.org/jira/browse/FLINK-31275
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>
> FLIP-314 has been accepted 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34090) Introduce SerializerConfig for serialization

2024-01-15 Thread Fang Yong (Jira)
Fang Yong created FLINK-34090:
-

 Summary: Introduce SerializerConfig for serialization
 Key: FLINK-34090
 URL: https://issues.apache.org/jira/browse/FLINK-34090
 Project: Flink
  Issue Type: Sub-task
  Components: API / Type Serialization System
Affects Versions: 1.19.0
Reporter: Fang Yong






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34037) FLIP-398: Improve Serialization Configuration And Usage In Flink

2024-01-08 Thread Fang Yong (Jira)
Fang Yong created FLINK-34037:
-

 Summary: FLIP-398: Improve Serialization Configuration And Usage 
In Flink
 Key: FLINK-34037
 URL: https://issues.apache.org/jira/browse/FLINK-34037
 Project: Flink
  Issue Type: Improvement
  Components: API / Type Serialization System, Runtime / Configuration
Affects Versions: 1.19.0
Reporter: Fang Yong


Improve serialization in 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-398%3A+Improve+Serialization+Configuration+And+Usage+In+Flink



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33682) Reuse source operator input records/bytes metrics for SourceOperatorStreamTask

2023-11-28 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-33682:
-

Assignee: Zhanghao Chen

> Reuse source operator input records/bytes metrics for SourceOperatorStreamTask
> --
>
> Key: FLINK-33682
> URL: https://issues.apache.org/jira/browse/FLINK-33682
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Metrics
>Reporter: Zhanghao Chen
>Assignee: Zhanghao Chen
>Priority: Major
>
> For SourceOperatorStreamTask, the source operator is the head operator that 
> takes input. We can directly reuse the source operator's input records/bytes 
> metrics for it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship

2023-11-26 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789890#comment-17789890
 ] 

Fang Yong commented on FLINK-31275:
---

[~mobuchowski] Sorry for the late reply.

What is the `DataSet` in `LineageVertex` used for? Why would a `LineageVertex` 
have multiple inputs or outputs? We hope that `LineageVertex` describes a 
single source or sink, rather than multiple. So I prefer to add a `Map` to 
`LineageVertex` directly to describe that particular information. We introduce 
`LineageEdge` in this FLIP to describe the relation between sources and sinks 
instead of adding `input` or `output` to `LineageVertex`.



> Flink supports reporting and storage of source/sink tables relationship
> ---
>
> Key: FLINK-31275
> URL: https://issues.apache.org/jira/browse/FLINK-31275
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>
> FLIP-314 has been accepted 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33626) Wrong style in flink ui

2023-11-23 Thread Fang Yong (Jira)
Fang Yong created FLINK-33626:
-

 Summary: Wrong style in flink ui
 Key: FLINK-33626
 URL: https://issues.apache.org/jira/browse/FLINK-33626
 Project: Flink
  Issue Type: Bug
  Components: Travis
Affects Versions: 1.19.0
Reporter: Fang Yong
 Attachments: image-2023-11-23-16-06-44-000.png

https://nightlies.apache.org/flink/flink-docs-master/

 !image-2023-11-23-16-06-44-000.png! 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship

2023-11-14 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17785782#comment-17785782
 ] 

Fang Yong commented on FLINK-31275:
---

[~mobuchowski] Sorry for the late reply, I think I get your point now and thank 
you for your valuable suggestions.

Based on our discussion, I think we can add facets to the `LineageVertex` as 
follows.

{code:java}
public interface LineageVertex {
    /* Facets for the lineage vertex to describe the information of the source/sink. */
    Map<String, Facet> facets();

    /* Config for the lineage vertex containing all the options for the connector. */
    Map<String, String> config();
}
{code}

We can implement some common facets such as `ConnectorFacet`, `StreamingFacet`, 
and `SchemaFacet`, and add new `Facet` implementations as needed in the future.
For a `TableLineageVertex` we can create the facets from the config in the 
table DDL, which makes the table options meaningful.
For DataStream jobs we can create the facets from the data source and sink.

{code:java}
/** Connector facet with a name and address for the connector, for example, jdbc
    and url for the jdbc connector, kafka and server for the kafka connector. */
public interface ConnectorFacet extends Facet {
    String name();
    String address();
}

/** Facet for a streaming connector. */
public interface StreamingFacet {
    String topic();
    String group();
    String format();
    String start();
}

/** Facet for the schema in a connector. */
public interface SchemaFacet {
    Map<String, String> fields();
}
{code}

What do you think of this? Thanks
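
For illustration, a small sketch of how a listener could consume such a facet 
map; it only assumes the shapes proposed above (the facet key and interfaces 
are still under discussion, not final API).

{code:java}
import java.util.Map;

// Sketch only: assumes the facet shapes proposed above; nothing here is final Flink API.
public class FacetConsumerSketch {
    interface Facet {}

    interface ConnectorFacet extends Facet {
        String name();
        String address();
    }

    static void report(Map<String, Facet> facets) {
        Facet facet = facets.get("connector"); // hypothetical facet key
        if (facet instanceof ConnectorFacet) {
            ConnectorFacet connector = (ConnectorFacet) facet;
            System.out.println("connector=" + connector.name() + ", address=" + connector.address());
        }
    }

    public static void main(String[] args) {
        ConnectorFacet kafka = new ConnectorFacet() {
            public String name() { return "kafka"; }
            public String address() { return "broker:9092"; } // hypothetical address
        };
        report(Map.of("connector", (Facet) kafka));
    }
}
{code}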



> Flink supports reporting and storage of source/sink tables relationship
> ---
>
> Key: FLINK-31275
> URL: https://issues.apache.org/jira/browse/FLINK-31275
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>
> FLIP-314 has been accepted 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship

2023-11-06 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17783147#comment-17783147
 ] 

Fang Yong edited comment on FLINK-31275 at 11/6/23 9:17 AM:


Hi [~mobuchowski], thanks for your reply.

I think our ideas are consistent, just at different levels of abstraction. The 
interface `LineageVertex` is the top-level interface for connectors in Flink, 
and we implement `TableLineageVertex` for tables, because a table is a complete 
definition, including the database, schema, etc. We put the options from the 
`with` clause into a map, which is consistent with the definition and usage 
habits of SQL in Flink.

For the official Flink connectors, we will implement `LineageVertex` for the 
`Source` and `InputFormat` of `DataStream` jobs, such as `KafkaSourceLineage`, 
as we mentioned in the FLIP: `We will implement LineageVertexProvider for the 
builtin source and sink such as KafkaSource, HiveSource, FlinkKafkaProducerBase, 
etc.`.
End users don't need to implement them. In order to be consistent with the 
usage habits of tables, we will put the corresponding information into a map 
when implementing it, and users can obtain it from there.

So, I think our current point of divergence is which level of abstraction the 
user needs to perceive. In the current FLIP, for DataStream jobs, listener 
developers need to identify whether the `LineageVertex` is a 
`KafkaSourceLineageVertex` or a `JdbcLineageVertex`. You mean we need to define 
another layer, such as a `DataSetConfig` interface, so that the listener 
developer can identify whether it is a `KafkaDataSetConfig` or a 
`JdbcDataSetConfig`, right?

Our current use of `LineageVertex` is mainly for flexibility, making it easy to 
add the information returned in the lineage vertex of a `DataStream`, such as 
the vector-type data source information mentioned in the FLIP example. At the 
same time, connector maintainers can easily provide a lineage vertex for 
customized connectors. If the connector is in table format, we prefer that 
users directly provide a `TableLineageVertex` instance.






was (Author: zjureel):
Hi [~mobuchowski] Thanks for your reply.

I think our ideas are consistent, just at different levels of abstraction. The 
interface `LineageVertex` is the top interface for connectors in Flink, and we 
implement `TableLineageVertex` for tables, because a Table is a complete 
definition, including the database, schema, etc. We put the options in the 
`with` into a map, which is consistent with the definition and usage habits of 
SQL in Flink.

For the official Flink connectors, we will implement the `LineageVertex` for 
`Source` and `InputFormat` for `DataStream` jobs, such as `KafkaSourceLineage`, 
etc, as we mentioned in FLINK: `We will implement LineageVertexProvider  for 
the builtin source and sink such as KafkaSource , HiveSource , 
FlinkKafkaProducerBase  and etc.`.
End-users don't need to implement them. In order to be consistent with the 
usage habits of tables, we will put the corresponding information into a map 
when implementing it, and users can obtain it.

So, I think our current point of divergence is which level of abstraction the 
user needs to perceive. In the current FLIP, for DataStream jobs, listener 
developers need to identify whether the `LineageVertex` is a 
`KafkaSourceLineageVertex` or a `JdbcLineageVertex`. You mean we need to define 
another layer, such as the `DataSetConfig` interface, and then the listener 
developer can identify whether it is a `KafkaDataSetConfig` or a 
`JdbcDataSetConfig`, right?

Our current use of `LineageVertexis` mainly to consider flexibility and 
facilitate the addition of returned information in the lineage vertex of the 
`DataStream`, such as the vector type data source information mentioned in the 
FLIP example. At the same time, connector maintainers can also easily provide 
lineage vertex for customized connectors. If the connector is in table format, 
we prefer that users directly provide a TableLineageVertex instance.





> Flink supports reporting and storage of source/sink tables relationship
> ---
>
> Key: FLINK-31275
> URL: https://issues.apache.org/jira/browse/FLINK-31275
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>
> FLIP-314 has been accepted 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship

2023-11-06 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17783147#comment-17783147
 ] 

Fang Yong commented on FLINK-31275:
---

Hi [~mobuchowski], thanks for your reply.

I think our ideas are consistent, just at different levels of abstraction. The 
interface `LineageVertex` is the top-level interface for connectors in Flink, 
and we implement `TableLineageVertex` for tables, because a table is a complete 
definition, including the database, schema, etc. We put the options from the 
`with` clause into a map, which is consistent with the definition and usage 
habits of SQL in Flink.

For the official Flink connectors, we will implement `LineageVertex` for the 
`Source` and `InputFormat` of `DataStream` jobs, such as `KafkaSourceLineage`, 
as we mentioned in the FLIP: `We will implement LineageVertexProvider for the 
builtin source and sink such as KafkaSource, HiveSource, FlinkKafkaProducerBase, 
etc.`.
End users don't need to implement them. In order to be consistent with the 
usage habits of tables, we will put the corresponding information into a map 
when implementing it, and users can obtain it from there.

So, I think our current point of divergence is which level of abstraction the 
user needs to perceive. In the current FLIP, for DataStream jobs, listener 
developers need to identify whether the `LineageVertex` is a 
`KafkaSourceLineageVertex` or a `JdbcLineageVertex`. You mean we need to define 
another layer, such as a `DataSetConfig` interface, so that the listener 
developer can identify whether it is a `KafkaDataSetConfig` or a 
`JdbcDataSetConfig`, right?

Our current use of `LineageVertex` is mainly for flexibility, making it easy to 
add the information returned in the lineage vertex of a `DataStream`, such as 
the vector-type data source information mentioned in the FLIP example. At the 
same time, connector maintainers can easily provide a lineage vertex for 
customized connectors. If the connector is in table format, we prefer that 
users directly provide a `TableLineageVertex` instance.





> Flink supports reporting and storage of source/sink tables relationship
> ---
>
> Key: FLINK-31275
> URL: https://issues.apache.org/jira/browse/FLINK-31275
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>
> FLIP-314 has been accepted 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship

2023-11-03 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782461#comment-17782461
 ] 

Fang Yong commented on FLINK-31275:
---

[~ZhenqiuHuang] Sounds good to me, I think we can consider this together. If 
there are some common connectors and relevant schemas for `DataStream` jobs, 
we can define these lineage vertices in Flink, such as a `ColumnsLineageVertex` 
which has multiple columns with different data types?

> Flink supports reporting and storage of source/sink tables relationship
> ---
>
> Key: FLINK-31275
> URL: https://issues.apache.org/jira/browse/FLINK-31275
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>
> FLIP-314 has been accepted 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship

2023-11-01 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17781613#comment-17781613
 ] 

Fang Yong edited comment on FLINK-31275 at 11/1/23 9:04 AM:


Hi [~mobuchowski], thanks for your comments. In the current FLIP, 
`LineageVertex` is the top-level interface for vertices in the lineage graph; 
it will be used in Flink SQL jobs and DataStream jobs.

1. For table connectors in SQL jobs, there will be a `TableLineageVertex` which 
is generated from the Flink catalog-based table and provides the catalog 
context and table schema for the specified connector. The table lineage vertex 
and edge implementations will be created from the dynamic tables for connectors 
in Flink, and they will be updated when the connectors are updated.

2. For customized sources/sinks in DataStream jobs, we can get the source and 
sink `LineageVertex` implementations from `LineageVertexProvider`. When users 
implement customized lineage vertices and edges, they need to update them when 
their connectors are updated.

IIUC, do you mean we should give an implementation of `LineageVertex` for 
DataStream jobs so users can provide source/sink information there, just like 
`TableLineageVertex` in SQL jobs? Then listeners can use the DataStream lineage 
vertex, which is similar to the table lineage vertex?

Due to the flexibility of sources and sinks in `DataStream`, we think it's hard 
to cover all of them, so we just provide `LineageVertex` and 
`LineageVertexProvider` for them and leave this flexibility to users and 
listeners. If a custom connector is a table in a `DataStream` job, users can 
return a `TableLineageVertex` in the `LineageVertexProvider`.

And for the following `LineageVertex`:
{code:java}
public interface LineageVertex {
    /* Config for the lineage vertex containing all the options for the connector. */
    Map<String, String> config();

    /* List of datasets that are consumed by this job. */
    List<DataSet> inputs();

    /* List of datasets that are produced by this job. */
    List<DataSet> outputs();
}
{code}

We tend to provide independent edge descriptions of connectors in `LineageEdge` 
for the lineage graph instead of adding datasets to `LineageVertex`. The 
`LineageVertex` here is the `DataSet` you mentioned.

WDYT? Hope to hear from you, thanks
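
To make the edge-based alternative concrete, a minimal sketch; the type and 
method names are illustrative only and may differ from the final FLIP-314 
interfaces.

{code:java}
import java.util.Map;

// Sketch of the edge-based shape argued for above; names are illustrative, not final API.
interface SketchLineageVertex {
    /** Connector options of this source or sink vertex. */
    Map<String, String> config();
}

interface SketchLineageEdge {
    SketchLineageVertex source(); // the source/sink relation lives on the edge,
    SketchLineageVertex sink();   // not as inputs()/outputs() on the vertex itself
}
{code}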







was (Author: zjureel):
Hi [~mobuchowski], thanks for your comments. In the currently FLIP the 
`LineageVertex` is the top interface for vertexes in lineage graph, it will be 
used in flink sql jobs and datastream jobs. 

1. For table connectors in sql jobs, there will be `TableLineageVertex` which 
is generated from flink catalog based table and provides catalog context, table 
schema for specified connector. The table lineage vertex and edge 
implementations will be created from dynamic tables for connectors in flink, 
and they will be updated when the connectors are updated.

2. For customized source/sink in datastream jobs, we can get source and slink 
`LineageVertex` implementations from `LineageVertexProvider`. When users 
implement customized lineage vertex and edge, they need to update them when 
their connectors are updated.

IIUC, do you mean we should give an implementation of `LineageVertex` for 
datastream jobs and users can provide source/sink information there just like 
`TableLinageVertex` in sql jobs? Then listeners can use the datastream lineage 
vertex which is similar with table lineage vertex? 

Due to the flexibility of the source and sink in `DataStream`, we think it's 
hard to cover all of them, so we just provide `LineageVertex` and 
`LineageVertexProvider` for them. So we left this flexibility to users and 
listeners. If a custom connector is a table in `DataStream` job, users can 
return `TableLineageVertex` in the `LineageVertexProvider`.

And for the following `LineageVertex`
```
public interface LineageVertex {
/* Config for the lineage vertex contains all the options for the 
connector. */
Map config();
/* List of datasets that are consumed by this job */
List inputs();
/* List of datasets that are produced by this job */
List outputs();
}
```
We tend to provide independent edge descriptions of connectors in `LineageEdge` 
for lineage graph instead of adding dataset in `LineageVertex`. The 
`LineageVertex` here is the `DataSet` you mentioned.

WDYT? Hope to hear from you, thanks






> Flink supports reporting and storage of source/sink tables relationship
> ---
>
> Key: FLINK-31275
> URL: https://issues.apache.org/jira/browse/FLINK-31275
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>
> FLIP-314 has been accepted 
> 

[jira] [Commented] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship

2023-11-01 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17781613#comment-17781613
 ] 

Fang Yong commented on FLINK-31275:
---

Hi [~mobuchowski], thanks for your comments. In the current FLIP, 
`LineageVertex` is the top-level interface for vertices in the lineage graph; 
it will be used in Flink SQL jobs and DataStream jobs.

1. For table connectors in SQL jobs, there will be a `TableLineageVertex` which 
is generated from the Flink catalog-based table and provides the catalog 
context and table schema for the specified connector. The table lineage vertex 
and edge implementations will be created from the dynamic tables for connectors 
in Flink, and they will be updated when the connectors are updated.

2. For customized sources/sinks in DataStream jobs, we can get the source and 
sink `LineageVertex` implementations from `LineageVertexProvider`. When users 
implement customized lineage vertices and edges, they need to update them when 
their connectors are updated.

IIUC, do you mean we should give an implementation of `LineageVertex` for 
DataStream jobs so users can provide source/sink information there, just like 
`TableLineageVertex` in SQL jobs? Then listeners can use the DataStream lineage 
vertex, which is similar to the table lineage vertex?

Due to the flexibility of sources and sinks in `DataStream`, we think it's hard 
to cover all of them, so we just provide `LineageVertex` and 
`LineageVertexProvider` for them and leave this flexibility to users and 
listeners. If a custom connector is a table in a `DataStream` job, users can 
return a `TableLineageVertex` in the `LineageVertexProvider`.

And for the following `LineageVertex`:
```
public interface LineageVertex {
    /* Config for the lineage vertex containing all the options for the connector. */
    Map<String, String> config();

    /* List of datasets that are consumed by this job. */
    List<DataSet> inputs();

    /* List of datasets that are produced by this job. */
    List<DataSet> outputs();
}
```
We tend to provide independent edge descriptions of connectors in `LineageEdge` 
for the lineage graph instead of adding datasets to `LineageVertex`. The 
`LineageVertex` here is the `DataSet` you mentioned.

WDYT? Hope to hear from you, thanks






> Flink supports reporting and storage of source/sink tables relationship
> ---
>
> Key: FLINK-31275
> URL: https://issues.apache.org/jira/browse/FLINK-31275
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>
> FLIP-314 has been accepted 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship

2023-10-23 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778890#comment-17778890
 ] 

Fang Yong commented on FLINK-31275:
---

[~ZhenqiuHuang] Sorry for the late reply, and please feel free to comment on 
the issues if you have any ideas or would like to take them, thanks

> Flink supports reporting and storage of source/sink tables relationship
> ---
>
> Key: FLINK-31275
> URL: https://issues.apache.org/jira/browse/FLINK-31275
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>
> FLIP-314 has been accepted 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33210) Introduce lineage graph relevant interfaces

2023-10-17 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-33210:
-

Assignee: Fang Yong

> Introduce lineage graph relevant interfaces 
> 
>
> Key: FLINK-33210
> URL: https://issues.apache.org/jira/browse/FLINK-33210
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / DataStream, Table SQL / API
>Affects Versions: 1.19.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>
> Introduce LineageGraph, LineageVertex and LineageEdge interfaces



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-33166) Support setting root logger level by config

2023-10-11 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong closed FLINK-33166.
-
Resolution: Fixed

Fixed by b14411e7062c235cb28582a37e37decc7c425476

> Support setting root logger level by config
> ---
>
> Key: FLINK-33166
> URL: https://issues.apache.org/jira/browse/FLINK-33166
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration
>Reporter: Zhanghao Chen
>Assignee: Zhanghao Chen
>Priority: Major
>  Labels: pull-request-available
>
> Users currently cannot change the logging level by config and have to modify 
> the cumbersome logger configuration file manually. We'd better provide a 
> shortcut and support setting the root logger level by config. There are a 
> number of configs already for logger settings, like {{env.log.dir}} for the 
> logging dir and {{env.log.max}} for the max number of old logging files to 
> keep. We can name the new config {{env.log.level}}.
> Multiple loggers are configured in the configuration files, some with a 
> different logging level from the root logger to reduce irrelevant logs. In 
> most cases, only the root logger is relevant, so we can make {{env.log.level}} 
> apply to the root logger only for simplicity.
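
For illustration, a hedged sketch of the effect the proposed option would have; 
it assumes Log4j 2 on the classpath, and {{env.log.level}} is the option 
proposed here, not an existing Flink key.

{code:java}
import org.apache.logging.log4j.Level;
import org.apache.logging.log4j.core.config.Configurator;

public final class RootLoggerLevelSketch {
    public static void main(String[] args) {
        // env.log.level is the option proposed in this issue; it is read from a system
        // property here purely for demonstration.
        String configuredLevel = System.getProperty("env.log.level", "INFO");
        // Apply the configured level to the root logger only, as suggested above.
        Configurator.setRootLevel(Level.toLevel(configuredLevel, Level.INFO));
    }
}
{code}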



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33221) Add config options for administrator JVM options

2023-10-11 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-33221:
-

Assignee: Zhanghao Chen

> Add config options for administrator JVM options
> 
>
> Key: FLINK-33221
> URL: https://issues.apache.org/jira/browse/FLINK-33221
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration
>Reporter: Zhanghao Chen
>Assignee: Zhanghao Chen
>Priority: Major
>
> We encounter issues similar to those described in SPARK-23472. Users may need 
> to add JVM options to their Flink applications (e.g. to tune GC options). They 
> typically use the {{env.java.opts.x}} series of options to do so. We also have 
> a set of administrator JVM options to apply by default, e.g. to enable GC 
> logging, tune GC options, etc. Both use cases need to set the same series of 
> options and will clobber one another.
> In the past, we generated and prepended the administrator JVM options in the 
> Java code that generates the start command for the JM/TM. However, this has 
> proven difficult to maintain.
> Therefore, I propose to also add a set of default JVM options for 
> administrator use that are prepended to the user-set extra JVM options. We can 
> mark the existing {{env.java.opts.x}} series as user-set extra JVM options and 
> add a set of new {{env.java.opts.x.default}} options for administrator use.
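
A tiny sketch of the intended ordering; the {{env.java.opts.x.default}} key is 
the proposal in this issue, not an existing option, and the option strings are 
made up for illustration.

{code:java}
public final class JvmOptionsOrderingSketch {
    public static void main(String[] args) {
        String adminDefaults = "-Xlog:gc* -XX:+UseG1GC"; // administrator defaults (proposed env.java.opts.x.default)
        String userExtras = "-XX:MaxGCPauseMillis=200";  // user-set extras (existing env.java.opts.x)
        // Administrator defaults are prepended, so user-set options appear later and can override them.
        String combined = String.join(" ", adminDefaults, userExtras);
        System.out.println(combined);
    }
}
{code}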



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-32309) Shared classpaths and jars manager for jobs in sql gateway cause confliction

2023-10-09 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-32309:
-

Assignee: Fang Yong  (was: FangYong)

> Shared classpaths and jars manager for jobs in sql gateway cause confliction
> 
>
> Key: FLINK-32309
> URL: https://issues.apache.org/jira/browse/FLINK-32309
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Gateway
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>  Labels: pull-request-available, stale-assigned
>
> Currently all jobs in the same session of the SQL gateway share the resource 
> manager which provides the classpath for jobs. After a job is executed, its 
> classpath and jars remain in the shared resource manager and are used by the 
> next jobs. This may cause too many unnecessary jars in a job or even cause 
> conflicts.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-33205) Replace Akka with Pekko in the description of "pekko.ssl.enabled"

2023-10-08 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong closed FLINK-33205.
-
  Assignee: Zhanghao Chen
Resolution: Fixed

Fixed by d652238e84037a25eb0cef137f34edb37d9e1ac2

> Replace Akka with Pekko in the description of "pekko.ssl.enabled"
> -
>
> Key: FLINK-33205
> URL: https://issues.apache.org/jira/browse/FLINK-33205
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Runtime / Configuration
>Affects Versions: 1.18.0
>Reporter: Zhanghao Chen
>Assignee: Zhanghao Chen
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>
> Current description: "Turns on SSL for Akka’s remote communication. This is 
> applicable only when the global ssl flag security.ssl.enabled is set to 
> true." "Akka" should be replaced with "Pekko".



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33212) Introduce job status changed listener for lineage

2023-10-08 Thread Fang Yong (Jira)
Fang Yong created FLINK-33212:
-

 Summary: Introduce job status changed listener for lineage
 Key: FLINK-33212
 URL: https://issues.apache.org/jira/browse/FLINK-33212
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / REST
Affects Versions: 1.19.0
Reporter: Fang Yong


Introduce job status changed listener relevant interfaces



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33211) Implement table lineage graph

2023-10-08 Thread Fang Yong (Jira)
Fang Yong created FLINK-33211:
-

 Summary: Implement table lineage graph
 Key: FLINK-33211
 URL: https://issues.apache.org/jira/browse/FLINK-33211
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Runtime
Affects Versions: 1.19.0
Reporter: Fang Yong


Implement table lineage graph



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33210) Introduce lineage graph relevant interfaces

2023-10-08 Thread Fang Yong (Jira)
Fang Yong created FLINK-33210:
-

 Summary: Introduce lineage graph relevant interfaces 
 Key: FLINK-33210
 URL: https://issues.apache.org/jira/browse/FLINK-33210
 Project: Flink
  Issue Type: Sub-task
  Components: API / DataStream, Table SQL / API
Affects Versions: 1.19.0
Reporter: Fang Yong


Introduce LineageGraph, LineageVertex and LineageEdge interfaces



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship

2023-10-06 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-31275:
-

Assignee: Fang Yong

> Flink supports reporting and storage of source/sink tables relationship
> ---
>
> Key: FLINK-31275
> URL: https://issues.apache.org/jira/browse/FLINK-31275
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>
> FLIP-314 has been accepted 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship

2023-10-06 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong updated FLINK-31275:
--
Description: FLIP-314 has been accepted 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener
  (was: Currently flink generates job id in `JobGraph` which can identify a 
job. On the other hand, flink create source/sink table in planner. We need to 
create relations between source and sink tables for the job with an identifier 
id)

> Flink supports reporting and storage of source/sink tables relationship
> ---
>
> Key: FLINK-31275
> URL: https://issues.apache.org/jira/browse/FLINK-31275
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Priority: Major
>
> FLIP-314 has been accepted 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship

2023-10-06 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17772734#comment-17772734
 ] 

Fang Yong commented on FLINK-31275:
---

[~ZhenqiuHuang] Thanks for your attention, FLIP-314 is currently being voted on: 
https://lists.apache.org/thread/dxdqjc0dd8rf1vbdg755zo1n2zj1tj8d

> Flink supports reporting and storage of source/sink tables relationship
> ---
>
> Key: FLINK-31275
> URL: https://issues.apache.org/jira/browse/FLINK-31275
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Priority: Major
>
> Currently Flink generates a job id in `JobGraph` which can identify a job. On 
> the other hand, Flink creates the source/sink tables in the planner. We need 
> to create relations between the source and sink tables for the job with an 
> identifying id.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33166) Support setting root logger level by config

2023-09-28 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-33166:
-

Assignee: Zhanghao Chen

> Support setting root logger level by config
> ---
>
> Key: FLINK-33166
> URL: https://issues.apache.org/jira/browse/FLINK-33166
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration
>Reporter: Zhanghao Chen
>Assignee: Zhanghao Chen
>Priority: Major
>
> Users currently cannot change the logging level by config and have to modify 
> the cumbersome logger configuration file manually. We'd better provide a 
> shortcut and support setting the root logger level by config. There are a 
> number of configs already for logger settings, like {{env.log.dir}} for the 
> logging dir and {{env.log.max}} for the max number of old logging files to 
> keep. We can name the new config {{env.log.level}}.
> Multiple loggers are configured in the configuration files, some with a 
> different logging level from the root logger to reduce irrelevant logs. In 
> most cases, only the root logger is relevant, so we can make {{env.log.level}} 
> apply to the root logger only for simplicity.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32405) Initialize catalog listener for CatalogManager

2023-09-20 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong closed FLINK-32405.
-
Resolution: Duplicate

Duplicate of FLINK-32404

> Initialize catalog listener for CatalogManager
> --
>
> Key: FLINK-32405
> URL: https://issues.apache.org/jira/browse/FLINK-32405
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: FangYong
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32405) Initialize catalog listener for CatalogManager

2023-09-20 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767354#comment-17767354
 ] 

Fang Yong commented on FLINK-32405:
---

OK, it is a duplicate of FLINK-32404

> Initialize catalog listener for CatalogManager
> --
>
> Key: FLINK-32405
> URL: https://issues.apache.org/jira/browse/FLINK-32405
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: FangYong
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Reopened] (FLINK-32405) Initialize catalog listener for CatalogManager

2023-09-20 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reopened FLINK-32405:
---

> Initialize catalog listener for CatalogManager
> --
>
> Key: FLINK-32405
> URL: https://issues.apache.org/jira/browse/FLINK-32405
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: FangYong
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32848) [JUnit5 Migration] The persistence, query, registration, rpc and shuffle packages of flink-runtime module

2023-09-20 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong closed FLINK-32848.
-
Resolution: Fixed

Fixed by 
2ced46b6ed4ebfb20f9b392cba4e789e16e4da7d...f2cb1d247283344e9194e63931a2948e09f73c93

> [JUnit5 Migration] The persistence, query, registration, rpc and shuffle 
> packages of flink-runtime module
> -
>
> Key: FLINK-32848
> URL: https://issues.apache.org/jira/browse/FLINK-32848
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Rui Fan
>Assignee: Zhanghao Chen
>Priority: Minor
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33033) Add haservice micro benchmark for olap

2023-09-17 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-33033:
-

Assignee: Fang Yong

> Add haservice micro benchmark for olap
> --
>
> Key: FLINK-33033
> URL: https://issues.apache.org/jira/browse/FLINK-33033
> Project: Flink
>  Issue Type: Sub-task
>  Components: Benchmarks
>Affects Versions: 1.19.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>
> Add micro benchmarks of the HA service for OLAP to improve performance for 
> short-lived jobs



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33051) GlobalFailureHandler interface should be retired in favor of LabeledGlobalFailureHandler

2023-09-13 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-33051:
-

Assignee: Matt Wang

> GlobalFailureHandler interface should be retired in favor of 
> LabeledGlobalFailureHandler
> 
>
> Key: FLINK-33051
> URL: https://issues.apache.org/jira/browse/FLINK-33051
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Panagiotis Garefalakis
>Assignee: Matt Wang
>Priority: Minor
>
> FLIP-304 introduced the `LabeledGlobalFailureHandler` interface, which is an 
> extension of the `GlobalFailureHandler` interface. The latter can thus be 
> removed in the future to avoid having interfaces with duplicate functions.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32396) Support timestamp for jdbc driver and gateway

2023-09-11 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong closed FLINK-32396.
-
Fix Version/s: 1.19.0
   Resolution: Fixed

Fixed by c33a6527d8383dc571e0b648b8a29322416ab9d6

> Support timestamp for jdbc driver and gateway
> -
>
> Key: FLINK-32396
> URL: https://issues.apache.org/jira/browse/FLINK-32396
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / JDBC
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Minor
>  Labels: auto-deprioritized-major, pull-request-available
> Fix For: 1.19.0
>
>
> Support the timestamp and timestamp_ltz data types for the JDBC driver and SQL gateway



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33033) Add haservice micro benchmark for olap

2023-09-04 Thread Fang Yong (Jira)
Fang Yong created FLINK-33033:
-

 Summary: Add haservice micro benchmark for olap
 Key: FLINK-33033
 URL: https://issues.apache.org/jira/browse/FLINK-33033
 Project: Flink
  Issue Type: Sub-task
  Components: Benchmarks
Affects Versions: 1.19.0
Reporter: Fang Yong


Add micro benchmarks of the HA service for OLAP to improve performance for 
short-lived jobs



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-25356) Add benchmarks for performance in OLAP scenarios

2023-09-04 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong updated FLINK-25356:
--
Parent: (was: FLINK-25318)
Issue Type: Technical Debt  (was: Sub-task)

> Add benchmarks for performance in OLAP scenarios
> 
>
> Key: FLINK-25356
> URL: https://issues.apache.org/jira/browse/FLINK-25356
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Benchmarks
>Reporter: Xintong Song
>Priority: Major
>
> As discussed in FLINK-25318, we would need a unified, publicly visible 
> benchmark setup for supporting OLAP performance improvements and 
> investigations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32667) Use standalone store and embedding writer for jobs with no-restart-strategy in session cluster

2023-09-04 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17761807#comment-17761807
 ] 

Fang Yong commented on FLINK-32667:
---

[~mapohl] As we discussed in the threads 
https://lists.apache.org/thread/2wj1wfzcg162534v8olqt18y2x9x99od and 
https://lists.apache.org/thread/szdr4ngrfcmo7zko4917393zbqhgw0v5, we have come 
to an agreement about Flink OLAP, and [~jark] has already added the relevant 
content to the Flink roadmap https://flink.apache.org/roadmap/ , so I think we 
can continue to promote OLAP-related issues.

Regarding this issue, I would like to share our current progress. We did a 
simple e2e test internally and found that the latency of the simplest query 
with HA is 6.5 times higher than without HA. I think we can add OLAP micro 
benchmarks for the HA service to flink-benchmarks and then continue this issue. 
After that, we can add a micro benchmark for the new interfaces to compare with 
the previous ones. What do you think? Thanks

> Use standalone store and embedding writer for jobs with no-restart-strategy 
> in session cluster
> --
>
> Key: FLINK-32667
> URL: https://issues.apache.org/jira/browse/FLINK-32667
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>  Labels: pull-request-available
>
> When a Flink session cluster uses the ZK or K8s high availability service, it 
> will store jobs in ZK or a ConfigMap. When we submit Flink OLAP jobs to the 
> session cluster, they always turn off the restart strategy. These jobs with 
> no-restart-strategy should not be stored in ZK or a ConfigMap in K8s.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32755) Add quick start guide for Flink OLAP

2023-09-03 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong updated FLINK-32755:
--
Fix Version/s: 1.19.0

> Add quick start guide for Flink OLAP
> 
>
> Key: FLINK-32755
> URL: https://issues.apache.org/jira/browse/FLINK-32755
> Project: Flink
>  Issue Type: Sub-task
>  Components: Documentation
>Reporter: xiangyu feng
>Assignee: xiangyu feng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>
> I propose to add a new {{QUICKSTART.md}} guide that provides instructions for 
> beginners to build a production-ready Flink OLAP service by using 
> flink-jdbc-driver, flink-sql-gateway and a Flink session cluster.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32755) Add quick start guide for Flink OLAP

2023-09-03 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong closed FLINK-32755.
-
Resolution: Fixed

Fixed by fbef3c22757a2352145599487beb84e02aaeb389

> Add quick start guide for Flink OLAP
> 
>
> Key: FLINK-32755
> URL: https://issues.apache.org/jira/browse/FLINK-32755
> Project: Flink
>  Issue Type: Sub-task
>  Components: Documentation
>Reporter: xiangyu feng
>Assignee: xiangyu feng
>Priority: Major
>  Labels: pull-request-available
>
> I propose to add a new {{QUICKSTART.md}} guide that provides instructions for 
> beginners to build a production-ready Flink OLAP service using 
> flink-jdbc-driver, flink-sql-gateway and a flink session cluster.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32749) Sql gateway supports default catalog loaded by CatalogStore

2023-09-01 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong closed FLINK-32749.
-
Fix Version/s: 1.19.0
   Resolution: Fixed

Fixed by 1797a70c0074b9e0cbfd20abc2c33050a7906578

> Sql gateway supports default catalog loaded by CatalogStore
> ---
>
> Key: FLINK-32749
> URL: https://issues.apache.org/jira/browse/FLINK-32749
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Gateway
>Affects Versions: 1.19.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>
> Currently the sql gateway creates an in-memory catalog as the default catalog; it 
> should support a default catalog loaded by the catalog store



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32755) Add quick start guide for Flink OLAP

2023-08-31 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong updated FLINK-32755:
--
Parent: FLINK-32898
Issue Type: Sub-task  (was: Improvement)

> Add quick start guide for Flink OLAP
> 
>
> Key: FLINK-32755
> URL: https://issues.apache.org/jira/browse/FLINK-32755
> Project: Flink
>  Issue Type: Sub-task
>  Components: Documentation
>Reporter: xiangyu feng
>Assignee: xiangyu feng
>Priority: Major
>  Labels: pull-request-available
>
> I propose to add a new {{QUICKSTART.md}} guide that provides instructions for 
> beginners to build a production-ready Flink OLAP service using 
> flink-jdbc-driver, flink-sql-gateway and a flink session cluster.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32755) Add quick start guide for Flink OLAP

2023-08-31 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong updated FLINK-32755:
--
Parent: (was: FLINK-32898)
Issue Type: Improvement  (was: Sub-task)

> Add quick start guide for Flink OLAP
> 
>
> Key: FLINK-32755
> URL: https://issues.apache.org/jira/browse/FLINK-32755
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: xiangyu feng
>Assignee: xiangyu feng
>Priority: Major
>  Labels: pull-request-available
>
> I propose to add a new {{QUICKSTART.md}} guide that provides instructions for 
> beginners to build a production-ready Flink OLAP service using 
> flink-jdbc-driver, flink-sql-gateway and a flink session cluster.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32755) Add quick start guide for Flink OLAP

2023-08-31 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong updated FLINK-32755:
--
Parent: FLINK-32898
Issue Type: Sub-task  (was: Improvement)

> Add quick start guide for Flink OLAP
> 
>
> Key: FLINK-32755
> URL: https://issues.apache.org/jira/browse/FLINK-32755
> Project: Flink
>  Issue Type: Sub-task
>  Components: Documentation
>Reporter: xiangyu feng
>Assignee: xiangyu feng
>Priority: Major
>  Labels: pull-request-available
>
> I propose to add a new {{QUICKSTART.md}} guide that provides instructions for 
> beginners to build a production-ready Flink OLAP service using 
> flink-jdbc-driver, flink-sql-gateway and a flink session cluster.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32755) Add quick start guide for Flink OLAP

2023-08-31 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong updated FLINK-32755:
--
Parent: (was: FLINK-25318)
Issue Type: Improvement  (was: Sub-task)

> Add quick start guide for Flink OLAP
> 
>
> Key: FLINK-32755
> URL: https://issues.apache.org/jira/browse/FLINK-32755
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: xiangyu feng
>Assignee: xiangyu feng
>Priority: Major
>  Labels: pull-request-available
>
> I propose to add a new {{QUICKSTART.md}} guide that provides instructions for 
> beginners to build a production-ready Flink OLAP service using 
> flink-jdbc-driver, flink-sql-gateway and a flink session cluster.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32968) Fix doc for customized catalog listener

2023-08-31 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong closed FLINK-32968.
-
Fix Version/s: 1.18.0
   1.19.0
   Resolution: Fixed

Fixed by d62feeca45290da610aa8aa1790e80f57397cd6d

> Fix doc for customized catalog listener
> ---
>
> Key: FLINK-32968
> URL: https://issues.apache.org/jira/browse/FLINK-32968
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.18.0, 1.19.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0, 1.19.0
>
>
> Refer to https://issues.apache.org/jira/browse/FLINK-32798 for more details



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-32396) Support timestamp for jdbc driver and gateway

2023-08-31 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-32396:
-

Assignee: Fang Yong

> Support timestamp for jdbc driver and gateway
> -
>
> Key: FLINK-32396
> URL: https://issues.apache.org/jira/browse/FLINK-32396
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / JDBC
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>  Labels: pull-request-available, stale-major
>
> Support the timestamp and timestamp_ltz data types for the jdbc driver and sql-gateway



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-32749) Sql gateway supports default catalog loaded by CatalogStore

2023-08-29 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-32749:
-

Assignee: Fang Yong

> Sql gateway supports default catalog loaded by CatalogStore
> ---
>
> Key: FLINK-32749
> URL: https://issues.apache.org/jira/browse/FLINK-32749
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Gateway
>Affects Versions: 1.19.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>  Labels: pull-request-available
>
> Currently the sql gateway creates an in-memory catalog as the default catalog; it 
> should support a default catalog loaded by the catalog store



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-32968) Fix doc for customized catalog listener

2023-08-28 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-32968:
-

Assignee: Fang Yong

> Fix doc for customized catalog listener
> ---
>
> Key: FLINK-32968
> URL: https://issues.apache.org/jira/browse/FLINK-32968
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.18.0, 1.19.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>  Labels: pull-request-available
>
> Refer to https://issues.apache.org/jira/browse/FLINK-32798 for more details



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32968) Fix doc for customized catalog listener

2023-08-28 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong updated FLINK-32968:
--
Summary: Fix doc for customized catalog listener  (was: Update doc for 
customized catalog listener)

> Fix doc for customized catalog listener
> ---
>
> Key: FLINK-32968
> URL: https://issues.apache.org/jira/browse/FLINK-32968
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.18.0, 1.19.0
>Reporter: Fang Yong
>Priority: Major
>
> Refer to https://issues.apache.org/jira/browse/FLINK-32798 for more details



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32968) Update doc for customized catalog listener

2023-08-27 Thread Fang Yong (Jira)
Fang Yong created FLINK-32968:
-

 Summary: Update doc for customized catalog listener
 Key: FLINK-32968
 URL: https://issues.apache.org/jira/browse/FLINK-32968
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.18.0, 1.19.0
Reporter: Fang Yong


Refer to https://issues.apache.org/jira/browse/FLINK-32798 for more details



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32798) Release Testing: Verify FLIP-294: Support Customized Catalog Modification Listener

2023-08-25 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17759072#comment-17759072
 ] 

Fang Yong commented on FLINK-32798:
---

[~ruanhang1993] Thanks for your suggestions and I will create an issue for the 
docs cc [~renqs]

> Release Testing: Verify FLIP-294: Support Customized Catalog Modification 
> Listener
> --
>
> Key: FLINK-32798
> URL: https://issues.apache.org/jira/browse/FLINK-32798
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 1.18.0
>Reporter: Qingsheng Ren
>Assignee: Hang Ruan
>Priority: Major
> Fix For: 1.18.0
>
> Attachments: result.png, sqls.png, test.png
>
>
> The document about catalog modification listener is: 
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/catalogs/#catalog-modification-listener



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32798) Release Testing: Verify FLIP-294: Support Customized Catalog Modification Listener

2023-08-23 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758019#comment-17758019
 ] 

Fang Yong commented on FLINK-32798:
---

[~ruanhang1993] I checked the implementation in sql-gateway and there's an 
issue in the document 
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/catalogs/#catalog-modification-listener
 that users cannot set the listener with `SET` for the sql-gateway. Users should 
add the option `table.catalog-modification.listeners` to flink-conf.yaml or pass 
it as a dynamic parameter for the sql-gateway. How can I create a PR for this? cc 
[~renqs]
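
For reference, the two ways that do work, assuming a listener factory registered under the (illustrative) identifier `my-listener`:

```
# flink-conf.yaml
table.catalog-modification.listeners: my-listener

# or as a dynamic parameter when starting the gateway
./bin/sql-gateway.sh start -Dtable.catalog-modification.listeners=my-listener
```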

> Release Testing: Verify FLIP-294: Support Customized Catalog Modification 
> Listener
> --
>
> Key: FLINK-32798
> URL: https://issues.apache.org/jira/browse/FLINK-32798
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 1.18.0
>Reporter: Qingsheng Ren
>Priority: Major
> Fix For: 1.18.0
>
> Attachments: test.png
>
>
> The document about catalog modification listener is: 
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/catalogs/#catalog-modification-listener



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32794) Release Testing: Verify FLIP-293: Introduce Flink Jdbc Driver For Sql Gateway

2023-08-22 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong updated FLINK-32794:
--
Description: Document for jdbc driver: 
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/jdbcdriver/

> Release Testing: Verify FLIP-293: Introduce Flink Jdbc Driver For Sql Gateway
> -
>
> Key: FLINK-32794
> URL: https://issues.apache.org/jira/browse/FLINK-32794
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 1.18.0
>Reporter: Qingsheng Ren
>Priority: Major
> Fix For: 1.18.0
>
>
> Document for jdbc driver: 
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/jdbcdriver/



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32794) Release Testing: Verify FLIP-293: Introduce Flink Jdbc Driver For Sql Gateway

2023-08-22 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17757760#comment-17757760
 ] 

Fang Yong commented on FLINK-32794:
---

Thanks [~renqs], DONE

> Release Testing: Verify FLIP-293: Introduce Flink Jdbc Driver For Sql Gateway
> -
>
> Key: FLINK-32794
> URL: https://issues.apache.org/jira/browse/FLINK-32794
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 1.18.0
>Reporter: Qingsheng Ren
>Priority: Major
> Fix For: 1.18.0
>
>
> Document for jdbc driver: 
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/jdbcdriver/



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32798) Release Testing: Verify FLIP-294: Support Customized Catalog Modification Listener

2023-08-22 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17757759#comment-17757759
 ] 

Fang Yong commented on FLINK-32798:
---

[~renqs] I have added the document link in the "Description"

> Release Testing: Verify FLIP-294: Support Customized Catalog Modification 
> Listener
> --
>
> Key: FLINK-32798
> URL: https://issues.apache.org/jira/browse/FLINK-32798
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 1.18.0
>Reporter: Qingsheng Ren
>Priority: Major
> Fix For: 1.18.0
>
>
> The document about catalog modification listener is: 
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/catalogs/#catalog-modification-listener



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32798) Release Testing: Verify FLIP-294: Support Customized Catalog Modification Listener

2023-08-22 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong updated FLINK-32798:
--
Description: The document about catalog modification listener is: 
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/catalogs/#catalog-modification-listener

> Release Testing: Verify FLIP-294: Support Customized Catalog Modification 
> Listener
> --
>
> Key: FLINK-32798
> URL: https://issues.apache.org/jira/browse/FLINK-32798
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 1.18.0
>Reporter: Qingsheng Ren
>Priority: Major
> Fix For: 1.18.0
>
>
> The document about catalog modification listener is: 
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/catalogs/#catalog-modification-listener



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32653) Add doc for catalog store

2023-08-17 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong closed FLINK-32653.
-
Fix Version/s: 1.19.0
   Resolution: Fixed

Thanks [~hackergin], fixed by 5628c7097875be4bd56fc7805dbdd727d92bdac7

> Add doc for catalog store
> -
>
> Key: FLINK-32653
> URL: https://issues.apache.org/jira/browse/FLINK-32653
> Project: Flink
>  Issue Type: Sub-task
>  Components: Documentation
>Reporter: Feng Jin
>Assignee: Feng Jin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-25015) job name should not always be `collect` submitted by sql client

2023-08-14 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-25015:
-

Assignee: xiangyu feng  (was: KevinyhZou)

> job name should not always be `collect` submitted by sql client
> ---
>
> Key: FLINK-25015
> URL: https://issues.apache.org/jira/browse/FLINK-25015
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Client
>Affects Versions: 1.14.0
>Reporter: KevinyhZou
>Assignee: xiangyu feng
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Attachments: image-2021-11-23-20-15-32-459.png, 
> image-2021-11-23-20-16-21-932.png
>
>
> I use the flink sql client to submit different sql queries to a flink session 
> cluster, and the sql job name is always `collect`, as below
> !image-2021-11-23-20-16-21-932.png!
> which makes no sense to users.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-31259) Gateway supports initialization of catalog at startup

2023-08-13 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong closed FLINK-31259.
-
Resolution: Fixed

See https://issues.apache.org/jira/browse/FLINK-32427

> Gateway supports initialization of catalog at startup
> -
>
> Key: FLINK-31259
> URL: https://issues.apache.org/jira/browse/FLINK-31259
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Gateway
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>  Labels: pull-request-available, stale-assigned
>
> Support initializing catalogs in the gateway when it starts



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32660) Support external file systems in FileCatalogStore

2023-08-11 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong closed FLINK-32660.
-
Resolution: Fixed

Thanks [~ferenc-csaky], fixed with f68967a7c938f6258125c3221be74522965d1ff2

> Support external file systems in FileCatalogStore
> -
>
> Key: FLINK-32660
> URL: https://issues.apache.org/jira/browse/FLINK-32660
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Ferenc Csaky
>Assignee: Ferenc Csaky
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32667) Use standalone store and embedding writer for jobs with no-restart-strategy in session cluster

2023-08-11 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17753091#comment-17753091
 ] 

Fang Yong commented on FLINK-32667:
---

[~mapohl] Thanks for your detailed explanation, I got it. Currently Flink 
creates a `FailoverStrategy` for jobs according to the cluster configuration, 
and as I mentioned above there are only two strategies: full and region.

I think decoupling `LeaderElection` and `job-related HA data` from 
`HighAvailabilityServices` is a very good solution; that's what we want for 
OLAP queries.
As you mentioned in 
[FLINK-31816|https://issues.apache.org/jira/browse/FLINK-31816?focusedCommentId=17741054=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17741054]:
 
`HighAvailabilityServices could get a single implementation that requires a 
factory method for creating JobGraphStore, JobResultStore, 
CheckpointRecoveryFactory, and BlobStore. Additionally, it would require a 
LeaderElectionService (which is essentially a factory for LeaderElection 
instances)`

I think we can do it now, and after that we can add a new failover strategy such 
as `none` for the cluster and create an embedded factory. What do you think? 
[~mapohl][~chesnay]



> Use standalone store and embedding writer for jobs with no-restart-strategy 
> in session cluster
> --
>
> Key: FLINK-32667
> URL: https://issues.apache.org/jira/browse/FLINK-32667
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>  Labels: pull-request-available
>
> When a flink session cluster uses the zk or k8s high availability service, it will 
> store jobs in zk or a ConfigMap. When we submit flink olap jobs to the session 
> cluster, they always have the restart strategy turned off. These jobs with 
> no-restart-strategy should not be stored in zk or in a ConfigMap in k8s



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-32667) Use standalone store and embedding writer for jobs with no-restart-strategy in session cluster

2023-08-11 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17753091#comment-17753091
 ] 

Fang Yong edited comment on FLINK-32667 at 8/11/23 8:00 AM:


[~mapohl] Thanks for your detailed explanation, I got it. Currently Flink 
creates a `FailoverStrategy` for jobs according to the cluster configuration, 
and as I mentioned above there are only two strategies: full and region.

I think decoupling `LeaderElection` and `job-related HA data` from 
`HighAvailabilityServices` is a very good solution; that's what we want for 
OLAP queries.
As you mentioned in 
[FLINK-31816|https://issues.apache.org/jira/browse/FLINK-31816?focusedCommentId=17741054=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17741054]:
 
```
HighAvailabilityServices could get a single implementation that requires a 
factory method for creating JobGraphStore, JobResultStore, 
CheckpointRecoveryFactory, and BlobStore. Additionally, it would require a 
LeaderElectionService (which is essentially a factory for LeaderElection 
instances)
```

I think we can do it now, and after that we can add a new failover strategy such 
as `none` for the cluster and create an embedded factory. What do you think? 
[~mapohl][~chesnay]




was (Author: zjureel):
[~mapohl] Thanks for your detailed explanation and I caught it. Currently Flink 
will create `FailoverStrategy` for jobs according to the cluster configuration 
and as I mentioned above there are only two strategies: full and region.

I think decoupling `LeaderElection` and `job-related HA data` from 
`HighAvailabilityServices` is a very good solution, that's what we want for 
OLAP queries.
As you mentioned in 
[FLINK-31816|https://issues.apache.org/jira/browse/FLINK-31816?focusedCommentId=17741054=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17741054]:
 
`HighAvailabilityServices could get a single implementation that requires a 
factory method for creating JobGraphStore, JobResultStore, 
CheckpointRecoveryFactory, and BlobStore. Additionally, it would require a 
LeaderElectionService (which is essentially a factory for LeaderElection 
instances)`

I think we can do it now and after that we can add a new failover strategy such 
as `none` for cluster and create embedding factory. What do you think? 
[~mapohl][~chesnay]



> Use standalone store and embedding writer for jobs with no-restart-strategy 
> in session cluster
> --
>
> Key: FLINK-32667
> URL: https://issues.apache.org/jira/browse/FLINK-32667
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>  Labels: pull-request-available
>
> When a flink session cluster uses the zk or k8s high availability service, it will 
> store jobs in zk or a ConfigMap. When we submit flink olap jobs to the session 
> cluster, they always have the restart strategy turned off. These jobs with 
> no-restart-strategy should not be stored in zk or in a ConfigMap in k8s



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32747) Fix ddl operations for catalog from CatalogStore

2023-08-09 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17752610#comment-17752610
 ] 

Fang Yong commented on FLINK-32747:
---

Fixed with 76554d186ad198384ff783e3e3f040dd738b7571 

> Fix ddl operations for catalog from CatalogStore
> 
>
> Key: FLINK-32747
> URL: https://issues.apache.org/jira/browse/FLINK-32747
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.19.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>  Labels: pull-request-available
>
> Fix ddl operations for catalogs loaded by the CatalogStore



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32747) Fix ddl operations for catalog from CatalogStore

2023-08-09 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong closed FLINK-32747.
-
Fix Version/s: 1.18.0
   Resolution: Fixed

> Fix ddl operations for catalog from CatalogStore
> 
>
> Key: FLINK-32747
> URL: https://issues.apache.org/jira/browse/FLINK-32747
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.19.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>
> Fix ddl operations for catalogs loaded by the CatalogStore



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32667) Use standalone store and embedding writer for jobs with no-restart-strategy in session cluster

2023-08-08 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17752214#comment-17752214
 ] 

Fang Yong commented on FLINK-32667:
---

[~chesnay] In addition to the above solution, I think we can add a new failover 
strategy. Currently there are two failover strategies: full and region; we could 
add a new strategy such as `none` for these jobs so that they don't need to store 
relevant information, WDYT?

> Use standalone store and embedding writer for jobs with no-restart-strategy 
> in session cluster
> --
>
> Key: FLINK-32667
> URL: https://issues.apache.org/jira/browse/FLINK-32667
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>  Labels: pull-request-available
>
> When a flink session cluster uses the zk or k8s high availability service, it will 
> store jobs in zk or a ConfigMap. When we submit flink olap jobs to the session 
> cluster, they always have the restart strategy turned off. These jobs with 
> no-restart-strategy should not be stored in zk or in a ConfigMap in k8s



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32667) Use standalone store and embedding writer for jobs with no-restart-strategy in session cluster

2023-08-08 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17752036#comment-17752036
 ] 

Fang Yong commented on FLINK-32667:
---

[~chesnay] [~mapohl] Currently the ha service consists of two parts.

Part 1: The Flink cluster registers its dispatcher address and rest port with the ha 
service (zk, or a ConfigMap for k8s), so that the client can get this 
information and submit jobs to the dispatcher via rest. This part is needed in the 
OLAP scenario.

Part 2: The Dispatcher validates, saves and recovers jobs from the JobGraphStore 
and JobResultStore, which is what failover requires.

In the OLAP scenario, Part 2 stores the jobs in external systems (zk/s3), which may 
cause latency jitter in olap queries, so we only need Part 1, not Part 2. My 
original idea was to add an option for the ha service so that it can use an embedded 
JobGraphStore and an in-memory JobResultStore; that may be the more suitable solution. 
But it means a job will not recover when the JM crashes, even if it is configured 
with failover.

The second idea is that we do not store jobs without failover in the ha service, 
but currently we can only check this according to the restart strategy, and there are 
also the issues mentioned by [~chesnay] above.

What do you think of this? Do you think the first idea (an option for the ha 
service) is feasible? Looking forward to your feedback, thanks
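
A minimal sketch of what the first idea could look like as configuration; the last key below is hypothetical (it does not exist in Flink today) and only illustrates the proposed switch:

```
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-node:2181

# Hypothetical option: keep leader election / address registration in ZooKeeper,
# but keep job graphs and job results in JobManager memory (no external writes).
high-availability.job-persistence.enabled: false
```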





> Use standalone store and embedding writer for jobs with no-restart-strategy 
> in session cluster
> --
>
> Key: FLINK-32667
> URL: https://issues.apache.org/jira/browse/FLINK-32667
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>  Labels: pull-request-available
>
> When a flink session cluster uses the zk or k8s high availability service, it will 
> store jobs in zk or a ConfigMap. When we submit flink olap jobs to the session 
> cluster, they always have the restart strategy turned off. These jobs with 
> no-restart-strategy should not be stored in zk or in a ConfigMap in k8s



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32747) Fix ddl operations for catalog from CatalogStore

2023-08-08 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong updated FLINK-32747:
--
Description: Fix ddl operations for catalog which loaded by CatalogStore  
(was: Support ddl for catalog which loaded by CatalogStore)

> Fix ddl operations for catalog from CatalogStore
> 
>
> Key: FLINK-32747
> URL: https://issues.apache.org/jira/browse/FLINK-32747
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.19.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>  Labels: pull-request-available
>
> Fix ddl operations for catalogs loaded by the CatalogStore



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32747) Fix ddl operations for catalog from CatalogStore

2023-08-08 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong updated FLINK-32747:
--
Summary: Fix ddl operations for catalog from CatalogStore  (was: Support 
ddl for catalog from CatalogStore)

> Fix ddl operations for catalog from CatalogStore
> 
>
> Key: FLINK-32747
> URL: https://issues.apache.org/jira/browse/FLINK-32747
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.19.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>  Labels: pull-request-available
>
> Support ddl for catalogs loaded by the CatalogStore



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32667) Use standalone store and embedding writer for jobs with no-restart-strategy in session cluster

2023-08-04 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17751092#comment-17751092
 ] 

Fang Yong commented on FLINK-32667:
---

[~mapohl] I think performance testing is a good suggestion; we also discussed 
this and created FLINK-25356 at the beginning. I will consider how to add an e2e 
benchmark to the flink-benchmarks project, and I think we should add micro 
benchmarks for the primary issues.

Unfortunately, there is currently no ML thread about short-living jobs in 
flink. We only discussed this with [~xtsong] offline when we created 
FLINK-25318, but I strongly agree with you that we need to initiate a broader 
discussion on the community dev ML. I will collect our practical experiences 
and start a discussion on the ML later, thank you very much for your valuable 
suggestion!

> Use standalone store and embedding writer for jobs with no-restart-strategy 
> in session cluster
> --
>
> Key: FLINK-32667
> URL: https://issues.apache.org/jira/browse/FLINK-32667
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>  Labels: pull-request-available
>
> When a flink session cluster uses the zk or k8s high availability service, it will 
> store jobs in zk or a ConfigMap. When we submit flink olap jobs to the session 
> cluster, they always have the restart strategy turned off. These jobs with 
> no-restart-strategy should not be stored in zk or in a ConfigMap in k8s



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32749) Sql gateway supports default catalog loaded by CatalogStore

2023-08-03 Thread Fang Yong (Jira)
Fang Yong created FLINK-32749:
-

 Summary: Sql gateway supports default catalog loaded by 
CatalogStore
 Key: FLINK-32749
 URL: https://issues.apache.org/jira/browse/FLINK-32749
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Gateway
Affects Versions: 1.19.0
Reporter: Fang Yong


Currently the sql gateway creates an in-memory catalog as the default catalog; it should 
support a default catalog loaded by the catalog store
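
A minimal sketch of the related configuration, assuming the file catalog store from FLIP-295 (the path is illustrative); the idea is that catalogs registered in the store could then also serve as the gateway's default catalog instead of the built-in in-memory one:

```
table.catalog-store.kind: file
table.catalog-store.file.path: file:///tmp/flink-catalog-store
```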



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32667) Use standalone store and embedding writer for jobs with no-restart-strategy in session cluster

2023-08-03 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750938#comment-17750938
 ] 

Fang Yong commented on FLINK-32667:
---

[~mapohl] We have an internal e2e test for olap queries that measures 
latency and QPS. However, e2e testing requires some resources. I'm considering 
how to build this test in the community, and whether each important issue can 
be tested in the flink-benchmarks project. What do you think? Thanks



> Use standalone store and embedding writer for jobs with no-restart-strategy 
> in session cluster
> --
>
> Key: FLINK-32667
> URL: https://issues.apache.org/jira/browse/FLINK-32667
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>  Labels: pull-request-available
>
> When a flink session cluster uses the zk or k8s high availability service, it will 
> store jobs in zk or a ConfigMap. When we submit flink olap jobs to the session 
> cluster, they always have the restart strategy turned off. These jobs with 
> no-restart-strategy should not be stored in zk or in a ConfigMap in k8s



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-32747) Support ddl for catalog from CatalogStore

2023-08-03 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-32747:
-

Assignee: Fang Yong

> Support ddl for catalog from CatalogStore
> -
>
> Key: FLINK-32747
> URL: https://issues.apache.org/jira/browse/FLINK-32747
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.19.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>  Labels: pull-request-available
>
> Support ddl for catalogs loaded by the CatalogStore



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32747) Support ddl for catalog from CatalogStore

2023-08-03 Thread Fang Yong (Jira)
Fang Yong created FLINK-32747:
-

 Summary: Support ddl for catalog from CatalogStore
 Key: FLINK-32747
 URL: https://issues.apache.org/jira/browse/FLINK-32747
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Affects Versions: 1.19.0
Reporter: Fang Yong


Support ddl for catalogs loaded by the CatalogStore



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-32667) Use standalone store and embedding writer for jobs with no-restart-strategy in session cluster

2023-07-28 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-32667:
-

Assignee: Fang Yong

> Use standalone store and embedding writer for jobs with no-restart-strategy 
> in session cluster
> --
>
> Key: FLINK-32667
> URL: https://issues.apache.org/jira/browse/FLINK-32667
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>  Labels: pull-request-available
>
> When a flink session cluster uses the zk or k8s high availability service, it will 
> store jobs in zk or a ConfigMap. When we submit flink olap jobs to the session 
> cluster, they always have the restart strategy turned off. These jobs with 
> no-restart-strategy should not be stored in zk or in a ConfigMap in k8s



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-25015) job name should not always be `collect` submitted by sql client

2023-07-27 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17747827#comment-17747827
 ] 

Fang Yong commented on FLINK-25015:
---

Hi [~zouyunhe], are you still working on this issue? I noticed it was 
created years ago. If you are not, we will take it over and fix it, thanks.

> job name should not always be `collect` submitted by sql client
> ---
>
> Key: FLINK-25015
> URL: https://issues.apache.org/jira/browse/FLINK-25015
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Client
>Affects Versions: 1.14.0
>Reporter: KevinyhZou
>Assignee: KevinyhZou
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Attachments: image-2021-11-23-20-15-32-459.png, 
> image-2021-11-23-20-16-21-932.png
>
>
> I use the flink sql client to submit different sql queries to a flink session 
> cluster, and the sql job name is always `collect`, as below
> !image-2021-11-23-20-16-21-932.png!
> which makes no sense to users.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-25015) job name should not always be `collect` submitted by sql client

2023-07-26 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong updated FLINK-25015:
--
Parent: FLINK-25318
Issue Type: Sub-task  (was: Improvement)

> job name should not always be `collect` submitted by sql client
> ---
>
> Key: FLINK-25015
> URL: https://issues.apache.org/jira/browse/FLINK-25015
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Client
>Affects Versions: 1.14.0
>Reporter: KevinyhZou
>Assignee: KevinyhZou
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Attachments: image-2021-11-23-20-15-32-459.png, 
> image-2021-11-23-20-16-21-932.png
>
>
> I use the flink sql client to submit different sql queries to a flink session 
> cluster, and the sql job name is always `collect`, as below
> !image-2021-11-23-20-16-21-932.png!
> which makes no sense to users.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-25015) job name should not always be `collect` submitted by sql client

2023-07-26 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17747763#comment-17747763
 ] 

Fang Yong commented on FLINK-25015:
---

[~zouyunhe] What's the progress of this issue? There are conflicts in the PR

> job name should not always be `collect` submitted by sql client
> ---
>
> Key: FLINK-25015
> URL: https://issues.apache.org/jira/browse/FLINK-25015
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Client
>Affects Versions: 1.14.0
>Reporter: KevinyhZou
>Assignee: KevinyhZou
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Attachments: image-2021-11-23-20-15-32-459.png, 
> image-2021-11-23-20-16-21-932.png
>
>
> I use the flink sql client to submit different sql queries to a flink session 
> cluster, and the sql job name is always `collect`, as below
> !image-2021-11-23-20-16-21-932.png!
> which makes no sense to users.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32698) Add getCheckpointOptions interface in ManagedSnapshotContext

2023-07-26 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17747737#comment-17747737
 ] 

Fang Yong commented on FLINK-32698:
---

[~Ming Li] `ManagedSnapshotContext` is a `public` interface, so there should be 
a FLIP if you want to add something to it. 

On the other hand, I found that the sink operator in `Paimon` is a 
`OneInputStreamOperator`, which is also `public` in flink, and it has a 
`snapshotState` method that carries the `snapshotType` in `CheckpointOptions`; maybe 
we can consider that first.
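
A minimal illustration of that approach, hedged on the Flink version (the accessors around CheckpointOptions have shifted slightly between releases):

```java
import org.apache.flink.runtime.checkpoint.CheckpointOptions;

// Sketch of the branching the comment above refers to: CheckpointOptions already
// exposes the snapshot type, so an operator that receives it in snapshotState can
// decide whether the produced table snapshot should be kept (savepoint) or be
// allowed to expire (regular checkpoint).
public final class SnapshotTypeCheck {

    private SnapshotTypeCheck() {}

    public static boolean shouldRetainForever(CheckpointOptions checkpointOptions) {
        // In recent Flink versions getCheckpointType() returns a SnapshotType
        // exposing an isSavepoint() flag.
        return checkpointOptions.getCheckpointType().isSavepoint();
    }
}
```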

> Add getCheckpointOptions interface in ManagedSnapshotContext
> 
>
> Key: FLINK-32698
> URL: https://issues.apache.org/jira/browse/FLINK-32698
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing
>Reporter: Ming Li
>Priority: Major
>
> Currently only {{checkpointID}} and {{checkpointTimestamp}} information are 
> provided in {{{}ManagedSnapshotContext{}}}. We hope to provide more 
> information about {{{}CheckpointOptions{}}}, so that operators can adopt 
> different logics when performing {{{}SnapshotState{}}}.
>  
> An example is to adopt different behaviors according to the type of 
> checkpoint. For example, in {{{}Paimon{}}}, we hope that the paimon‘s 
> snapshot written by {{checkpoint}} can expire automatically, while the 
> paimon‘s snapshot written by {{savepoint}} can be persisted.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32402) FLIP-294: Support Customized Catalog Modification Listener

2023-07-26 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17747719#comment-17747719
 ] 

Fang Yong commented on FLINK-32402:
---

[~knaufk] Currently I think it's better to add this to `catalogs.md` so users 
can get all information about catalogs on that page; I noticed that some 
advanced catalog usages, such as `User-Defined Catalog`, are already in 
`catalogs.md`. What do you think?

> FLIP-294: Support Customized Catalog Modification Listener
> --
>
> Key: FLINK-32402
> URL: https://issues.apache.org/jira/browse/FLINK-32402
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Ecosystem
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>
> Issue for 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-294%3A+Support+Customized+Catalog+Modification+Listener



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-32676) Add doc for catalog modification listener

2023-07-25 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-32676:
-

Assignee: Fang Yong

> Add doc for catalog modification listener
> -
>
> Key: FLINK-32676
> URL: https://issues.apache.org/jira/browse/FLINK-32676
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Table SQL / Runtime
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>  Labels: pull-request-available
>
> Add doc for catalog modification listener



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32676) Add doc for catalog modification listener

2023-07-25 Thread Fang Yong (Jira)
Fang Yong created FLINK-32676:
-

 Summary: Add doc for catalog modification listener
 Key: FLINK-32676
 URL: https://issues.apache.org/jira/browse/FLINK-32676
 Project: Flink
  Issue Type: Improvement
  Components: Documentation, Table SQL / Runtime
Affects Versions: 1.18.0
Reporter: Fang Yong


Add doc for catalog modification listener



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32402) FLIP-294: Support Customized Catalog Modification Listener

2023-07-25 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17747235#comment-17747235
 ] 

Fang Yong commented on FLINK-32402:
---

Thanks [~knaufk], I'll create an issue to add doc for this feature

> FLIP-294: Support Customized Catalog Modification Listener
> --
>
> Key: FLINK-32402
> URL: https://issues.apache.org/jira/browse/FLINK-32402
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Ecosystem
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: Fang Yong
>Priority: Major
>
> Issue for 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-294%3A+Support+Customized+Catalog+Modification+Listener



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32667) Use standalone store and embedding writer for jobs with no-restart-strategy in session cluster

2023-07-25 Thread Fang Yong (Jira)
Fang Yong created FLINK-32667:
-

 Summary: Use standalone store and embedding writer for jobs with 
no-restart-strategy in session cluster
 Key: FLINK-32667
 URL: https://issues.apache.org/jira/browse/FLINK-32667
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Affects Versions: 1.18.0
Reporter: Fang Yong


When a flink session cluster uses the zk or k8s high availability service, it will 
store jobs in zk or a ConfigMap. When we submit flink olap jobs to the session 
cluster, they always have the restart strategy turned off. These jobs with 
no-restart-strategy should not be stored in zk or in a ConfigMap in k8s



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32665) Support read null value for csv format

2023-07-25 Thread Fang Yong (Jira)
Fang Yong created FLINK-32665:
-

 Summary: Support read null value for csv format
 Key: FLINK-32665
 URL: https://issues.apache.org/jira/browse/FLINK-32665
 Project: Flink
  Issue Type: Bug
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Affects Versions: 1.18.0
Reporter: Fang Yong


When there is a null column in a file with csv format, an exception will be thrown 
when the flink job tries to parse the data
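
A possible workaround sketch with the built-in csv format options (the table definition and path are illustrative); `csv.null-literal` maps a chosen literal, here the empty string, to NULL instead of failing, and `csv.ignore-parse-errors` skips rows that still cannot be parsed:

```sql
CREATE TABLE csv_input (
  id   INT,
  name STRING
) WITH (
  'connector' = 'filesystem',
  'path' = 'file:///tmp/input',
  'format' = 'csv',
  'csv.null-literal' = '',
  'csv.ignore-parse-errors' = 'true'
);
```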



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29317) Add WebSocket in Dispatcher to support olap query submission and push results in session cluster

2023-07-24 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746784#comment-17746784
 ] 

Fang Yong commented on FLINK-29317:
---

Hi [~dmvk] Sorry for the late reply. Currently we would like to promote 
improvements around Flink OLAP in the community. Are you still interested in this 
area? We want to create the FLIP and initiate discussions in the community 
later, thanks

> Add WebSocket in Dispatcher to support olap query submission and push results 
> in session cluster
> 
>
> Key: FLINK-29317
> URL: https://issues.apache.org/jira/browse/FLINK-29317
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination, Runtime / REST
>Affects Versions: 1.14.5, 1.15.3
>Reporter: Fang Yong
>Priority: Major
>
> Currently the client submits an olap query to the flink session cluster via the 
> http rest api and pulls the results through interval polling. The sink task in 
> the TaskManager creates a socket server for each query; when the JobManager 
> receives the pull request from the client, it requests query results from the 
> socket server. The process is as follows
> Job submission path:
> client -> http rest -> JobManager -> Sink Socket Server
> Result acquisition path:
> client <- http rest <- JobManager <- Sink Socket Server
> This leads to the following problems
> 1. There will be some performance loss when submitting jobs through http 
> rest; for example, temporary files will be created for each job
> 2. The client pulls the result data at a certain time interval, which is a 
> fixed cost. A larger interval increases query latency, while a smaller 
> interval increases the pressure on the Dispatcher.
> 3. Each sink task initializes a socket server, which increases the query 
> latency and, on the other hand, wastes resources.
> For the Flink OLAP scenario, we propose to add a websocket protocol in the 
> session cluster to support submitting jobs and returning results. The client 
> creates and manages a connection with the websocket server and submits olap 
> queries to the session cluster. The TaskManagers create and manage connections 
> to the websocket server too, and the sink task sends results to the server as a 
> stream. When the JobManager receives the results from the sink task, it pushes 
> the result data to the client through the connection between them.
> We implemented this feature in ByteDance's internal Flink version. On 
> average, the latency of each query can be reduced by about 100ms, which is a big 
> optimization for OLAP queries.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32643) Introduce off-heap shared state cache across stateful operators in TM

2023-07-21 Thread Fang Yong (Jira)
Fang Yong created FLINK-32643:
-

 Summary: Introduce off-heap shared state cache across stateful 
operators in TM
 Key: FLINK-32643
 URL: https://issues.apache.org/jira/browse/FLINK-32643
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / State Backends
Affects Versions: 1.19.0
Reporter: Fang Yong


Currently each stateful operator creates an independent db instance if it 
uses rocksdb as the state backend, and we can configure 
`state.backend.rocksdb.block.cache-size` for each db to speed up state 
access. This parameter defaults to 8M, and we cannot set it too large 
(such as 512M) because that may cause OOM, so each DB cannot utilize memory 
effectively. To address this issue, we would like to introduce an off-heap shared 
state cache across multiple db instances for stateful operators in the TM.
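
For context, a sketch of the knobs that exist today (values are illustrative); the per-instance block cache is what the proposal wants to replace with a TM-wide shared off-heap cache:

```
# flink-conf.yaml (illustrative values)
state.backend.rocksdb.block.cache-size: 64mb   # applies to each RocksDB instance separately
state.backend.rocksdb.memory.managed: true     # lets the DBs of one slot share that slot's managed memory
```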



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32405) Initialize catalog listener for CatalogManager

2023-07-21 Thread Fang Yong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong closed FLINK-32405.
-
Resolution: Fixed

> Initialize catalog listener for CatalogManager
> --
>
> Key: FLINK-32405
> URL: https://issues.apache.org/jira/browse/FLINK-32405
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.18.0
>Reporter: Fang Yong
>Assignee: FangYong
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32633) Kubernetes e2e test is not stable

2023-07-19 Thread Fang Yong (Jira)
Fang Yong created FLINK-32633:
-

 Summary: Kubernetes e2e test is not stable
 Key: FLINK-32633
 URL: https://issues.apache.org/jira/browse/FLINK-32633
 Project: Flink
  Issue Type: Technical Debt
  Components: Deployment / Kubernetes, Kubernetes Operator
Affects Versions: 1.18.0
Reporter: Fang Yong


The output file is: 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51444=logs=bea52777-eaf8-5663-8482-18fbc3630e81=43ba8ce7-ebbf-57cd-9163-444305d74117

Jul 19 17:06:02 Stopping minikube ...
Jul 19 17:06:02 * Stopping node "minikube"  ...
Jul 19 17:06:13 * 1 node stopped.
Jul 19 17:06:13 [FAIL] Test script contains errors.
Jul 19 17:06:13 Checking for errors...
Jul 19 17:06:13 No errors in log files.
Jul 19 17:06:13 Checking for exceptions...
Jul 19 17:06:13 No exceptions in log files.
Jul 19 17:06:13 Checking for non-empty .out files...
grep: /home/vsts/work/_temp/debug_files/flink-logs/*.out: No such file or 
directory
Jul 19 17:06:13 No non-empty .out files.
Jul 19 17:06:13 
Jul 19 17:06:13 [FAIL] 'Run Kubernetes test' failed after 4 minutes and 28 
seconds! Test exited with exit code 1
Jul 19 17:06:13 
17:06:13 ##[group]Environment Information
Jul 19 17:06:13 Jps




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32622) Do not add mini-batch assigner operator if it is useless

2023-07-18 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17744215#comment-17744215
 ] 

Fang Yong commented on FLINK-32622:
---

cc [~libenchao] [~zoudan]

> Do not add mini-batch assigner operator if it is useless
> 
>
> Key: FLINK-32622
> URL: https://issues.apache.org/jira/browse/FLINK-32622
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.19.0
>Reporter: Fang Yong
>Priority: Major
>
> Currently, if users configure mini-batch for their sql jobs, flink will always add 
> a mini-batch assigner operator to the job plan even when there are no agg/join 
> operators in the job. The mini-batch assigner will generate useless events and 
> cause performance issues for them. If mini-batch is useless for a specific 
> job, flink should not add the mini-batch assigner even when users turn on the 
> mini-batch mechanism. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32622) Do not add mini-batch assigner operator if it is useless

2023-07-18 Thread Fang Yong (Jira)
Fang Yong created FLINK-32622:
-

 Summary: Do not add mini-batch assigner operator if it is useless
 Key: FLINK-32622
 URL: https://issues.apache.org/jira/browse/FLINK-32622
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Fang Yong


Currently, if users configure mini-batch for their sql jobs, flink will always add 
a mini-batch assigner operator to the job plan even when there are no agg/join 
operators in the job. The mini-batch assigner will generate useless events and 
cause performance issues for them. If mini-batch is useless for a specific job, 
flink should not add the mini-batch assigner even when users turn on the 
mini-batch mechanism. 
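
For illustration, the relevant options and the kind of job affected (table names are made up); with today's behavior the pass-through INSERT below still gets a mini-batch assigner once the options are enabled:

```sql
SET 'table.exec.mini-batch.enabled' = 'true';
SET 'table.exec.mini-batch.allow-latency' = '1 s';
SET 'table.exec.mini-batch.size' = '1000';

-- No aggregation or join, so the mini-batch assigner adds overhead without benefit.
INSERT INTO sink_table SELECT id, name FROM source_table;
```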



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32370) JDBC SQl gateway e2e test is unstable

2023-07-04 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17739896#comment-17739896
 ] 

Fang Yong commented on FLINK-32370:
---

[~Sergey Nuyanzin] Sorry, I checked the shell script and found the missing double 
quotation marks; I'll add them

> JDBC SQl gateway e2e test is unstable
> -
>
> Key: FLINK-32370
> URL: https://issues.apache.org/jira/browse/FLINK-32370
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Tests
>Affects Versions: 1.18.0
>Reporter: Chesnay Schepler
>Assignee: Fang Yong
>Priority: Blocker
>  Labels: pull-request-available, test-stability
> Fix For: 1.18.0
>
> Attachments: flink-vsts-sql-gateway-0-fv-az75-650.log, 
> flink-vsts-standalonesession-0-fv-az75-650.log, 
> flink-vsts-taskexecutor-0-fv-az75-650.log
>
>
> The client is failing while trying to collect data when the job already 
> finished on the cluster.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

