[jira] [Assigned] (FLINK-33211) Implement table lineage graph
[ https://issues.apache.org/jira/browse/FLINK-33211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reassigned FLINK-33211: - Assignee: Zhenqiu Huang (was: Peter Huang) > Implement table lineage graph > - > > Key: FLINK-33211 > URL: https://issues.apache.org/jira/browse/FLINK-33211 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Runtime >Affects Versions: 1.19.0 >Reporter: Fang Yong >Assignee: Zhenqiu Huang >Priority: Major > > Implement table lineage graph -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33211) Implement table lineage graph
[ https://issues.apache.org/jira/browse/FLINK-33211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reassigned FLINK-33211: - Assignee: Peter Huang > Implement table lineage graph > - > > Key: FLINK-33211 > URL: https://issues.apache.org/jira/browse/FLINK-33211 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Runtime >Affects Versions: 1.19.0 >Reporter: Fang Yong >Assignee: Peter Huang >Priority: Major > > Implement table lineage graph -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-33210) Introduce lineage graph relevant interfaces
[ https://issues.apache.org/jira/browse/FLINK-33210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong closed FLINK-33210. - Fix Version/s: 1.20.0 Resolution: Resolved Closed by 6b9c282b5090720a02b941586cb7f5389691dba9 > Introduce lineage graph relevant interfaces > > > Key: FLINK-33210 > URL: https://issues.apache.org/jira/browse/FLINK-33210 > Project: Flink > Issue Type: Sub-task > Components: API / DataStream, Table SQL / API >Affects Versions: 1.19.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available > Fix For: 1.20.0 > > > Introduce LineageGraph, LineageVertex and LineageEdge interfaces -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-34460) Jdbc driver get rid of flink-core
[ https://issues.apache.org/jira/browse/FLINK-34460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong updated FLINK-34460: -- Description: Currently jdbc driver depends on flink-core/flink-runtime modules, users need to upgrade jdbc driver version when the flink session cluster for olap is upgraded, this is not very suitable. (was: Currently jdbc driver depends on flink-core/flink-runtime module, users needs to upgrade jdbc driver version when the flink session cluster for olap is upgraded, this is not very suitable.) > Jdbc driver get rid of flink-core > - > > Key: FLINK-34460 > URL: https://issues.apache.org/jira/browse/FLINK-34460 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Client >Affects Versions: 1.20.0 >Reporter: Fang Yong >Priority: Major > > Currently jdbc driver depends on flink-core/flink-runtime modules, users need > to upgrade jdbc driver version when the flink session cluster for olap is > upgraded, this is not very suitable. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-34466) Implement Lineage Interface in Kafka Connector
[ https://issues.apache.org/jira/browse/FLINK-34466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reassigned FLINK-34466: - Assignee: Zhenqiu Huang > Implement Lineage Interface in Kafka Connector > -- > > Key: FLINK-34466 > URL: https://issues.apache.org/jira/browse/FLINK-34466 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Kafka >Affects Versions: 1.19.0 >Reporter: Zhenqiu Huang >Assignee: Zhenqiu Huang >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-34468) Implement Lineage Interface in Cassandra Connector
[ https://issues.apache.org/jira/browse/FLINK-34468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reassigned FLINK-34468: - Assignee: Zhenqiu Huang > Implement Lineage Interface in Cassandra Connector > -- > > Key: FLINK-34468 > URL: https://issues.apache.org/jira/browse/FLINK-34468 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Cassandra >Reporter: Zhenqiu Huang >Assignee: Zhenqiu Huang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-34467) Implement Lineage Interface in Jdbc Connector
[ https://issues.apache.org/jira/browse/FLINK-34467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reassigned FLINK-34467: - Assignee: Zhenqiu Huang > Implement Lineage Interface in Jdbc Connector > - > > Key: FLINK-34467 > URL: https://issues.apache.org/jira/browse/FLINK-34467 > Project: Flink > Issue Type: Sub-task > Components: Connectors / JDBC >Affects Versions: 1.19.0 >Reporter: Zhenqiu Huang >Assignee: Zhenqiu Huang >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-34460) Jdbc driver get rid of flink-core
Fang Yong created FLINK-34460: - Summary: Jdbc driver get rid of flink-core Key: FLINK-34460 URL: https://issues.apache.org/jira/browse/FLINK-34460 Project: Flink Issue Type: Improvement Components: Table SQL / Client Affects Versions: 1.20.0 Reporter: Fang Yong Currently the jdbc driver depends on the flink-core/flink-runtime modules, so users need to upgrade the jdbc driver version when the flink session cluster for olap is upgraded, which is not very suitable. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-34239) Introduce a deep copy method of SerializerConfig for merging with Table configs in org.apache.flink.table.catalog.DataTypeFactoryImpl
[ https://issues.apache.org/jira/browse/FLINK-34239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17815486#comment-17815486 ] Fang Yong commented on FLINK-34239: --- [~mallikarjuna] DONE > Introduce a deep copy method of SerializerConfig for merging with Table > configs in org.apache.flink.table.catalog.DataTypeFactoryImpl > -- > > Key: FLINK-34239 > URL: https://issues.apache.org/jira/browse/FLINK-34239 > Project: Flink > Issue Type: Sub-task > Components: API / Core >Affects Versions: 1.19.0 >Reporter: Zhanghao Chen >Assignee: Kumar Mallikarjuna >Priority: Major > > *Problem* > Currently, > org.apache.flink.table.catalog.DataTypeFactoryImpl#createSerializerExecutionConfig > will create a deep-copy of the SerializerConfig and merge Table config into > it. However, the deep copy is done by manually calling the getter and setter > methods of SerializerConfig, and is prone to human errors, e.g. missing > copying a newly added field in SerializerConfig. > *Proposal* > Introduce a deep copy method for SerializerConfig and replace the current implementation > in > org.apache.flink.table.catalog.DataTypeFactoryImpl#createSerializerExecutionConfig. -- This message was sent by Atlassian Jira (v8.20.10#820010)
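The proposal above, replacing manual getter/setter copying with a single deep-copy method, can be sketched roughly as follows. The class and field names here are simplified, hypothetical stand-ins for illustration, not Flink's actual SerializerConfig.

```java
// Simplified stand-in for a serializer config class. The fields are
// hypothetical; the point is that copy() gathers all copying in one place,
// so a newly added field only needs to be handled here, not at every
// call site that previously copied fields by hand.
import java.util.LinkedHashMap;
import java.util.Map;

public class SerializerConfigCopySketch {
    private final Map<String, String> registeredTypes = new LinkedHashMap<>();
    private boolean forceKryo;

    void setForceKryo(boolean forceKryo) { this.forceKryo = forceKryo; }
    boolean isForceKryo() { return forceKryo; }
    Map<String, String> getRegisteredTypes() { return registeredTypes; }

    /** Deep copy of this config; mutable state (the map) is copied, not shared. */
    SerializerConfigCopySketch copy() {
        SerializerConfigCopySketch copied = new SerializerConfigCopySketch();
        copied.forceKryo = this.forceKryo;
        copied.registeredTypes.putAll(this.registeredTypes);
        return copied;
    }

    public static void main(String[] args) {
        SerializerConfigCopySketch original = new SerializerConfigCopySketch();
        original.setForceKryo(true);
        original.getRegisteredTypes().put("MyPojo", "MyPojoSerializer");

        // Merging table options into the copy must not touch the original.
        SerializerConfigCopySketch merged = original.copy();
        merged.getRegisteredTypes().put("TableType", "TableSerializer");

        System.out.println(original.getRegisteredTypes().size()); // prints 1
        System.out.println(merged.isForceKryo()); // prints true
    }
}
```

A centralized copy() like this is what makes the "missing a newly added field" error class much easier to catch in review.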
[jira] [Assigned] (FLINK-34239) Introduce a deep copy method of SerializerConfig for merging with Table configs in org.apache.flink.table.catalog.DataTypeFactoryImpl
[ https://issues.apache.org/jira/browse/FLINK-34239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reassigned FLINK-34239: - Assignee: Kumar Mallikarjuna > Introduce a deep copy method of SerializerConfig for merging with Table > configs in org.apache.flink.table.catalog.DataTypeFactoryImpl > -- > > Key: FLINK-34239 > URL: https://issues.apache.org/jira/browse/FLINK-34239 > Project: Flink > Issue Type: Sub-task > Components: API / Core >Affects Versions: 1.19.0 >Reporter: Zhanghao Chen >Assignee: Kumar Mallikarjuna >Priority: Major > > *Problem* > Currently, > org.apache.flink.table.catalog.DataTypeFactoryImpl#createSerializerExecutionConfig > will create a deep-copy of the SerializerConfig and merge Table config into > it. However, the deep copy is done by manually calling the getter and setter > methods of SerializerConfig, and is prone to human errors, e.g. missing > copying a newly added field in SerializerConfig. > *Proposal* > Introduce a deep copy method for SerializerConfig and replace the current implementation > in > org.apache.flink.table.catalog.DataTypeFactoryImpl#createSerializerExecutionConfig. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-34402) Class loading conflicts when using PowerMock in ITcase.
[ https://issues.apache.org/jira/browse/FLINK-34402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reassigned FLINK-34402: - Assignee: yisha zhou > Class loading conflicts when using PowerMock in ITcase. > > > Key: FLINK-34402 > URL: https://issues.apache.org/jira/browse/FLINK-34402 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.19.0 >Reporter: yisha zhou >Assignee: yisha zhou >Priority: Major > Fix For: 1.19.0 > > > Currently, when no user jars exist, the system classLoader will be used to load > classes by default. However, if we use PowerMock to create some ITCases, the > framework will utilize JavassistMockClassLoader to load classes. Forcing the > use of the system classLoader can lead to class loading conflicts. > Therefore we should use Thread.currentThread().getContextClassLoader() > instead of > ClassLoader.getSystemClassLoader() here. -- This message was sent by Atlassian Jira (v8.20.10#820010)
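The fix described above, preferring the thread context class loader over the system class loader, can be illustrated with a minimal sketch. The helper method name is hypothetical and not Flink's actual code; it only shows why the context loader (which mocking frameworks such as PowerMock install on the test thread) is the right choice.

```java
// Minimal illustration: a mock framework typically installs its own class
// loader as the thread's context class loader. A helper that hard-codes
// ClassLoader.getSystemClassLoader() would bypass it and cause conflicts.
public class ClassLoaderChoice {

    /** Hypothetical helper: prefer the context loader, fall back to system. */
    static ClassLoader userCodeClassLoader() {
        ClassLoader contextLoader = Thread.currentThread().getContextClassLoader();
        return contextLoader != null ? contextLoader : ClassLoader.getSystemClassLoader();
    }

    public static void main(String[] args) {
        // Simulate a framework-installed loader (as PowerMock would do).
        ClassLoader custom = new ClassLoader(ClassLoader.getSystemClassLoader()) {};
        Thread.currentThread().setContextClassLoader(custom);

        // The helper now resolves to the installed loader, not the system one.
        System.out.println(userCodeClassLoader() == custom); // prints true
    }
}
```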
[jira] [Commented] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship
[ https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17813003#comment-17813003 ] Fang Yong commented on FLINK-31275: --- Hi [~mobuchowski] [~ZhenqiuHuang], I feel that there are many details that need to be discussed regarding the interfaces. Would it be convenient for you to send an email to me (zjur...@gmail.com) so that we can schedule an offline meeting to discuss it clearly first, and then initiate a new discussion in the dev mailing list. Hope to hear from you, thanks :) > Flink supports reporting and storage of source/sink tables relationship > --- > > Key: FLINK-31275 > URL: https://issues.apache.org/jira/browse/FLINK-31275 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > > FLIP-314 has been accepted > https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-34090) Introduce SerializerConfig for serialization
Fang Yong created FLINK-34090: - Summary: Introduce SerializerConfig for serialization Key: FLINK-34090 URL: https://issues.apache.org/jira/browse/FLINK-34090 Project: Flink Issue Type: Sub-task Components: API / Type Serialization System Affects Versions: 1.19.0 Reporter: Fang Yong -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-34037) FLIP-398: Improve Serialization Configuration And Usage In Flink
Fang Yong created FLINK-34037: - Summary: FLIP-398: Improve Serialization Configuration And Usage In Flink Key: FLINK-34037 URL: https://issues.apache.org/jira/browse/FLINK-34037 Project: Flink Issue Type: Improvement Components: API / Type Serialization System, Runtime / Configuration Affects Versions: 1.19.0 Reporter: Fang Yong Improve serialization in https://cwiki.apache.org/confluence/display/FLINK/FLIP-398%3A+Improve+Serialization+Configuration+And+Usage+In+Flink -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33682) Reuse source operator input records/bytes metrics for SourceOperatorStreamTask
[ https://issues.apache.org/jira/browse/FLINK-33682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reassigned FLINK-33682: - Assignee: Zhanghao Chen > Reuse source operator input records/bytes metrics for SourceOperatorStreamTask > -- > > Key: FLINK-33682 > URL: https://issues.apache.org/jira/browse/FLINK-33682 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Metrics >Reporter: Zhanghao Chen >Assignee: Zhanghao Chen >Priority: Major > > For SourceOperatorStreamTask, the source operator is the head operator that takes > input. We can directly reuse the source operator input records/bytes metrics for > it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship
[ https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17789890#comment-17789890 ] Fang Yong commented on FLINK-31275: --- [~mobuchowski] Sorry for the late reply. What is the `DataSet` in `LineageVertex` used for? Why does a `LineageVertex` have multiple inputs or outputs? We hope that `LineageVertex` describes a single source or sink, rather than multiple ones. So I prefer to add `Map` to `LineageVertex` directly to describe the particular information. We introduce `LineageEdge` in this FLIP to describe the relation between sources and sinks instead of adding `input` or `output` to `LineageVertex`. > Flink supports reporting and storage of source/sink tables relationship > --- > > Key: FLINK-31275 > URL: https://issues.apache.org/jira/browse/FLINK-31275 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > > FLIP-314 has been accepted > https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33626) Wrong style in flink ui
Fang Yong created FLINK-33626: - Summary: Wrong style in flink ui Key: FLINK-33626 URL: https://issues.apache.org/jira/browse/FLINK-33626 Project: Flink Issue Type: Bug Components: Travis Affects Versions: 1.19.0 Reporter: Fang Yong Attachments: image-2023-11-23-16-06-44-000.png https://nightlies.apache.org/flink/flink-docs-master/ !image-2023-11-23-16-06-44-000.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship
[ https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17785782#comment-17785782 ] Fang Yong commented on FLINK-31275: --- [~mobuchowski] Sorry for the late reply, I think I get your point now and thank you for your valuable suggestions. Based on our discussion, I think we can add facets in the `LineageVertex` as follows. {code:java} public interface LineageVertex { /* Facets for the lineage vertex to describe the information of source/sink. */ Map facets; /* Config for the lineage vertex contains all the options for the connector. */ Map config; } {code} We can implement some common facets such as `ConnectorFacet`, `StreamingFacet`, `SchemaFacet`. We can add new `Facet` implementations as needed in the future. For `TableLineageVertex` we can create facets from the config in the table ddl, which will make the table options meaningful. For datastream jobs we can create facets from the data source and sink. {code:java} /** Connector facet has name and address for the connector, for example, jdbc and url for jdbc connector, kafka and server for kafka connector. */ public interface ConnectorFacet extends Facet { String name(); String address(); } /** Facet for a streaming connector. */ public interface StreamingFacet { String topic(); String group(); String format(); String start(); } /** Facet for the schema in a connector. */ public interface SchemaFacet { Map fields(); } {code} What do you think of this? 
Thanks > Flink supports reporting and storage of source/sink tables relationship > --- > > Key: FLINK-31275 > URL: https://issues.apache.org/jira/browse/FLINK-31275 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > > FLIP-314 has been accepted > https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener -- This message was sent by Atlassian Jira (v8.20.10#820010)
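As a rough illustration of how a lineage listener might consume such facets, the sketch below assumes a map keyed by facet name. All class names, keys, and values here are illustrative assumptions, not the interfaces proposed in the FLIP.

```java
// Hedged sketch: a vertex exposes named facets; a listener looks one up
// by name and downcasts to the concrete facet type it understands.
import java.util.HashMap;
import java.util.Map;

public class FacetDemo {
    interface Facet {}

    // Illustrative connector facet carrying a name and an address.
    static final class ConnectorFacet implements Facet {
        final String name;
        final String address;
        ConnectorFacet(String name, String address) {
            this.name = name;
            this.address = address;
        }
    }

    // Illustrative vertex holding facets keyed by facet name.
    static final class VertexSketch {
        final Map<String, Facet> facets = new HashMap<>();
    }

    public static void main(String[] args) {
        VertexSketch vertex = new VertexSketch();
        vertex.facets.put("connector", new ConnectorFacet("kafka", "broker:9092"));

        // A listener retrieves the facet it cares about and downcasts.
        Facet f = vertex.facets.get("connector");
        if (f instanceof ConnectorFacet) {
            System.out.println(((ConnectorFacet) f).name); // prints kafka
        }
    }
}
```

The map-of-facets shape keeps the vertex interface stable while letting new facet types be added without breaking existing listeners.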
[jira] [Comment Edited] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship
[ https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17783147#comment-17783147 ] Fang Yong edited comment on FLINK-31275 at 11/6/23 9:17 AM: Hi [~mobuchowski] Thanks for your reply. I think our ideas are consistent, just at different levels of abstraction. The interface `LineageVertex` is the top interface for connectors in Flink, and we implement `TableLineageVertex` for tables, because a Table is a complete definition, including the database, schema, etc. We put the options in the `with` clause into a map, which is consistent with the definition and usage habits of SQL in Flink. For the official Flink connectors, we will implement the `LineageVertex` for `Source` and `InputFormat` for `DataStream` jobs, such as `KafkaSourceLineage`, etc, as we mentioned in the FLIP: `We will implement LineageVertexProvider for the builtin source and sink such as KafkaSource , HiveSource , FlinkKafkaProducerBase and etc.`. End-users don't need to implement them. In order to be consistent with the usage habits of tables, we will put the corresponding information into a map when implementing it, and users can obtain it. So, I think our current point of divergence is which level of abstraction the user needs to perceive. In the current FLIP, for DataStream jobs, listener developers need to identify whether the `LineageVertex` is a `KafkaSourceLineageVertex` or a `JdbcLineageVertex`. You mean we need to define another layer, such as the `DataSetConfig` interface, and then the listener developer can identify whether it is a `KafkaDataSetConfig` or a `JdbcDataSetConfig`, right? Our current use of `LineageVertex` is mainly to consider flexibility and facilitate the addition of returned information in the lineage vertex of the `DataStream`, such as the vector type data source information mentioned in the FLIP example. 
At the same time, connector maintainers can also easily provide lineage vertex for customized connectors. If the connector is in table format, we prefer that users directly provide a TableLineageVertex instance. > Flink supports reporting and storage of source/sink tables relationship > --- > > Key: FLINK-31275 > URL: https://issues.apache.org/jira/browse/FLINK-31275 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > > FLIP-314 has been accepted > https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship
[ https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17783147#comment-17783147 ] Fang Yong commented on FLINK-31275: --- Hi [~mobuchowski] Thanks for your reply. I think our ideas are consistent, just at different levels of abstraction. The interface `LineageVertex` is the top interface for connectors in Flink, and we implement `TableLineageVertex` for tables, because a Table is a complete definition, including the database, schema, etc. We put the options in the `with` clause into a map, which is consistent with the definition and usage habits of SQL in Flink. For the official Flink connectors, we will implement the `LineageVertex` for `Source` and `InputFormat` for `DataStream` jobs, such as `KafkaSourceLineage`, etc, as we mentioned in the FLIP: `We will implement LineageVertexProvider for the builtin source and sink such as KafkaSource , HiveSource , FlinkKafkaProducerBase and etc.`. End-users don't need to implement them. In order to be consistent with the usage habits of tables, we will put the corresponding information into a map when implementing it, and users can obtain it. So, I think our current point of divergence is which level of abstraction the user needs to perceive. In the current FLIP, for DataStream jobs, listener developers need to identify whether the `LineageVertex` is a `KafkaSourceLineageVertex` or a `JdbcLineageVertex`. You mean we need to define another layer, such as the `DataSetConfig` interface, and then the listener developer can identify whether it is a `KafkaDataSetConfig` or a `JdbcDataSetConfig`, right? Our current use of `LineageVertex` is mainly to consider flexibility and facilitate the addition of returned information in the lineage vertex of the `DataStream`, such as the vector type data source information mentioned in the FLIP example. At the same time, connector maintainers can also easily provide lineage vertex for customized connectors. 
If the connector is in table format, we prefer that users directly provide a TableLineageVertex instance. > Flink supports reporting and storage of source/sink tables relationship > --- > > Key: FLINK-31275 > URL: https://issues.apache.org/jira/browse/FLINK-31275 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > > FLIP-314 has been accepted > https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship
[ https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17782461#comment-17782461 ] Fang Yong commented on FLINK-31275: --- [~ZhenqiuHuang] Sounds nice to me, I think we can consider this together. If there are some common connectors and relevant schemas for the `DataStream` jobs, we can define these lineage vertexes in flink, such as `ColumnsLineageVertex` which has multiple columns with different data types? > Flink supports reporting and storage of source/sink tables relationship > --- > > Key: FLINK-31275 > URL: https://issues.apache.org/jira/browse/FLINK-31275 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > > FLIP-314 has been accepted > https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship
[ https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17781613#comment-17781613 ] Fang Yong edited comment on FLINK-31275 at 11/1/23 9:04 AM: Hi [~mobuchowski], thanks for your comments. In the current FLIP the `LineageVertex` is the top interface for vertexes in the lineage graph, and it will be used in flink sql jobs and datastream jobs. 1. For table connectors in sql jobs, there will be `TableLineageVertex` which is generated from the flink catalog based table and provides catalog context and table schema for the specified connector. The table lineage vertex and edge implementations will be created from dynamic tables for connectors in flink, and they will be updated when the connectors are updated. 2. For customized source/sink in datastream jobs, we can get source and sink `LineageVertex` implementations from `LineageVertexProvider`. When users implement customized lineage vertexes and edges, they need to update them when their connectors are updated. IIUC, do you mean we should give an implementation of `LineageVertex` for datastream jobs so that users can provide source/sink information there just like `TableLineageVertex` in sql jobs? Then listeners can use the datastream lineage vertex which is similar to the table lineage vertex? Due to the flexibility of the source and sink in `DataStream`, we think it's hard to cover all of them, so we just provide `LineageVertex` and `LineageVertexProvider` for them and leave this flexibility to users and listeners. If a custom connector is a table in a `DataStream` job, users can return `TableLineageVertex` in the `LineageVertexProvider`. And for the following `LineageVertex` {code:java} public interface LineageVertex { /* Config for the lineage vertex contains all the options for the connector. */ Map config(); /* List of datasets that are consumed by this job */ List inputs(); /* List of datasets that are produced by this job */ List outputs(); } {code} We tend to provide independent edge descriptions of connectors in `LineageEdge` for the lineage graph instead of adding datasets in `LineageVertex`. The `LineageVertex` here is the `DataSet` you mentioned. WDYT? Hope to hear from you, thanks > Flink supports reporting and storage of source/sink tables relationship > --- > > Key: FLINK-31275 > URL: https://issues.apache.org/jira/browse/FLINK-31275 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > > FLIP-314 has been accepted > https://cwiki.apache.o
[jira] [Commented] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship
[ https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17781613#comment-17781613 ] Fang Yong commented on FLINK-31275: --- Hi [~mobuchowski], thanks for your comments. In the current FLIP the `LineageVertex` is the top interface for vertexes in the lineage graph, and it will be used in flink sql jobs and datastream jobs. 1. For table connectors in sql jobs, there will be `TableLineageVertex` which is generated from the flink catalog based table and provides catalog context and table schema for the specified connector. The table lineage vertex and edge implementations will be created from dynamic tables for connectors in flink, and they will be updated when the connectors are updated. 2. For customized source/sink in datastream jobs, we can get source and sink `LineageVertex` implementations from `LineageVertexProvider`. When users implement customized lineage vertexes and edges, they need to update them when their connectors are updated. IIUC, do you mean we should give an implementation of `LineageVertex` for datastream jobs so that users can provide source/sink information there just like `TableLineageVertex` in sql jobs? Then listeners can use the datastream lineage vertex which is similar to the table lineage vertex? Due to the flexibility of the source and sink in `DataStream`, we think it's hard to cover all of them, so we just provide `LineageVertex` and `LineageVertexProvider` for them and leave this flexibility to users and listeners. If a custom connector is a table in a `DataStream` job, users can return `TableLineageVertex` in the `LineageVertexProvider`. And for the following `LineageVertex` ``` public interface LineageVertex { /* Config for the lineage vertex contains all the options for the connector. 
*/ Map config(); /* List of datasets that are consumed by this job */ List inputs(); /* List of datasets that are produced by this job */ List outputs(); } ``` We tend to provide independent edge descriptions of connectors in `LineageEdge` for lineage graph instead of adding dataset in `LineageVertex`. The `LineageVertex` here is the `DataSet` you mentioned. WDYT? Hope to hear from you, thanks > Flink supports reporting and storage of source/sink tables relationship > --- > > Key: FLINK-31275 > URL: https://issues.apache.org/jira/browse/FLINK-31275 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > > FLIP-314 has been accepted > https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener -- This message was sent by Atlassian Jira (v8.20.10#820010)
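The edge-based design argued for in these comments, relating sources to sinks through `LineageEdge` rather than embedding inputs/outputs in the vertex, can be sketched as follows. All class and field names are illustrative assumptions, not the FLIP's final API.

```java
// Hedged sketch: each vertex describes a single source or sink, and an
// edge records the relation between exactly one source and one sink.
// A listener recovers the table relationship by walking the edges.
import java.util.ArrayList;
import java.util.List;

public class LineageEdgeSketch {
    static final class Vertex {
        final String name;
        Vertex(String name) { this.name = name; }
    }

    // The edge, not the vertex, carries the source-to-sink relation.
    static final class Edge {
        final Vertex source;
        final Vertex sink;
        Edge(Vertex source, Vertex sink) { this.source = source; this.sink = sink; }
    }

    public static void main(String[] args) {
        Vertex kafkaSource = new Vertex("kafka_orders");
        Vertex jdbcSink = new Vertex("mysql_orders");

        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge(kafkaSource, jdbcSink));

        for (Edge e : edges) {
            System.out.println(e.source.name + " -> " + e.sink.name);
        }
    }
}
```

Keeping each vertex single-purpose means a connector only describes itself, while the job-level graph assembly stays in the framework.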
[jira] [Commented] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship
[ https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17778890#comment-17778890 ] Fang Yong commented on FLINK-31275: --- [~ZhenqiuHuang] Sorry for the late reply, and please feel free to comment on the issue if you have any ideas or would like to take it, thanks > Flink supports reporting and storage of source/sink tables relationship > --- > > Key: FLINK-31275 > URL: https://issues.apache.org/jira/browse/FLINK-31275 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > > FLIP-314 has been accepted > https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33210) Introduce lineage graph relevant interfaces
[ https://issues.apache.org/jira/browse/FLINK-33210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reassigned FLINK-33210: - Assignee: Fang Yong > Introduce lineage graph relevant interfaces > > > Key: FLINK-33210 > URL: https://issues.apache.org/jira/browse/FLINK-33210 > Project: Flink > Issue Type: Sub-task > Components: API / DataStream, Table SQL / API >Affects Versions: 1.19.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > > Introduce LineageGraph, LineageVertex and LineageEdge interfaces -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-33166) Support setting root logger level by config
[ https://issues.apache.org/jira/browse/FLINK-33166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong closed FLINK-33166. - Resolution: Fixed Fixed by b14411e7062c235cb28582a37e37decc7c425476 > Support setting root logger level by config > --- > > Key: FLINK-33166 > URL: https://issues.apache.org/jira/browse/FLINK-33166 > Project: Flink > Issue Type: Improvement > Components: Runtime / Configuration >Reporter: Zhanghao Chen >Assignee: Zhanghao Chen >Priority: Major > Labels: pull-request-available > > Users currently cannot change the logging level by config and have to modify the > cumbersome logger configuration file manually. We'd better provide a shortcut > and support setting the root logger level by config. There are a number of configs > already for logger settings, like {{env.log.dir}} for the logging dir and > {{env.log.max}} for the max number of old logging files to keep. We can name the > new config {{env.log.level}}. > Multiple loggers are configured in the configuration files, some with > logging levels different from the root logger to reduce irrelevant logs. In > most cases, only the root logger is relevant, so we can make {{env.log.level}} > apply to the root logger only for simplicity. -- This message was sent by Atlassian Jira (v8.20.10#820010)
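In flink-conf.yaml terms, the proposal boils down to one new entry next to the existing logging options; a sketch (the option name `env.log.level` comes from this issue, and the values shown are illustrative):

```yaml
# Existing logging-related options
env.log.dir: /var/log/flink   # directory for log files
env.log.max: 10               # max number of old log files to keep

# Proposed: sets the ROOT logger level only; per-logger overrides in the
# log4j/logback configuration files would still take precedence
env.log.level: DEBUG
```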
[jira] [Assigned] (FLINK-33221) Add config options for administrator JVM options
[ https://issues.apache.org/jira/browse/FLINK-33221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reassigned FLINK-33221: - Assignee: Zhanghao Chen > Add config options for administrator JVM options > > > Key: FLINK-33221 > URL: https://issues.apache.org/jira/browse/FLINK-33221 > Project: Flink > Issue Type: Improvement > Components: Runtime / Configuration >Reporter: Zhanghao Chen >Assignee: Zhanghao Chen >Priority: Major > > We encounter issues similar to those described in SPARK-23472. Users may need to add > JVM options to their Flink applications (e.g. to tune GC options). They > typically use the {{env.java.opts.x}} series of options to do so. We also have a > set of administrator JVM options to apply by default, e.g. to enable GC logging, > tune GC options, etc. Both use cases need to set the same series of > options and will clobber one another. > In the past, we generated and prepended the administrator JVM options in > the Java code for generating the starting command for the JM/TM. However, this > has proven to be difficult to maintain. > Therefore, I propose to also add a set of default JVM options for > administrator use that is prepended to the user-set extra JVM options. We can mark > the existing {{env.java.opts.x}} series as user-set extra JVM options and add > a set of new {{env.java.opts.x.default}} options for administrator use. -- This message was sent by Atlassian Jira (v8.20.10#820010)
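The proposed split can be sketched as two flink-conf.yaml entries per process (the `.default` suffix is the proposal from this issue, not a released option, and the concrete JVM flags are illustrative):

```yaml
# Administrator defaults, prepended first (e.g. GC logging and GC tuning)
env.java.opts.taskmanager.default: "-XX:+UseG1GC -Xlog:gc*:file=/var/log/flink/tm-gc.log"

# User-set extras, appended after the defaults so users can still override them
env.java.opts.taskmanager: "-XX:MaxGCPauseMillis=200"
```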
[jira] [Assigned] (FLINK-32309) Shared classpaths and jars manager for jobs in sql gateway cause confliction
[ https://issues.apache.org/jira/browse/FLINK-32309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reassigned FLINK-32309: - Assignee: Fang Yong (was: FangYong) > Shared classpaths and jars manager for jobs in sql gateway cause confliction > > > Key: FLINK-32309 > URL: https://issues.apache.org/jira/browse/FLINK-32309 > Project: Flink > Issue Type: Bug > Components: Table SQL / Gateway >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available, stale-assigned > > Currently all jobs in the same session of the sql gateway share the resource > manager, which provides the classpath for jobs. After a job is executed, its > classpath and jars remain in the shared resource manager and are used by > the next jobs. This may put too many unnecessary jars on a job's classpath or even > cause conflicts -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-33205) Replace Akka with Pekko in the description of "pekko.ssl.enabled"
[ https://issues.apache.org/jira/browse/FLINK-33205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong closed FLINK-33205. - Assignee: Zhanghao Chen Resolution: Fixed Fixed by d652238e84037a25eb0cef137f34edb37d9e1ac2 > Replace Akka with Pekko in the description of "pekko.ssl.enabled" > - > > Key: FLINK-33205 > URL: https://issues.apache.org/jira/browse/FLINK-33205 > Project: Flink > Issue Type: Technical Debt > Components: Runtime / Configuration >Affects Versions: 1.18.0 >Reporter: Zhanghao Chen >Assignee: Zhanghao Chen >Priority: Minor > Labels: pull-request-available > Fix For: 1.19.0 > > > Current description: "Turns on SSL for Akka’s remote communication. This is > applicable only when the global ssl flag security.ssl.enabled is set to > true." "Akka" should be replaced with "Pekko". -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33212) Introduce job status changed listener for lineage
Fang Yong created FLINK-33212: - Summary: Introduce job status changed listener for lineage Key: FLINK-33212 URL: https://issues.apache.org/jira/browse/FLINK-33212 Project: Flink Issue Type: Sub-task Components: Runtime / REST Affects Versions: 1.19.0 Reporter: Fang Yong Introduce job status changed listener relevant interfaces -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33211) Implement table lineage graph
Fang Yong created FLINK-33211: - Summary: Implement table lineage graph Key: FLINK-33211 URL: https://issues.apache.org/jira/browse/FLINK-33211 Project: Flink Issue Type: Sub-task Components: Table SQL / Runtime Affects Versions: 1.19.0 Reporter: Fang Yong Implement table lineage graph -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33210) Introduce lineage graph relevant interfaces
Fang Yong created FLINK-33210: - Summary: Introduce lineage graph relevant interfaces Key: FLINK-33210 URL: https://issues.apache.org/jira/browse/FLINK-33210 Project: Flink Issue Type: Sub-task Components: API / DataStream, Table SQL / API Affects Versions: 1.19.0 Reporter: Fang Yong Introduce LineageGraph, LineageVertex and LineageEdge interfaces -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship
[ https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reassigned FLINK-31275: - Assignee: Fang Yong > Flink supports reporting and storage of source/sink tables relationship > --- > > Key: FLINK-31275 > URL: https://issues.apache.org/jira/browse/FLINK-31275 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > > FLIP-314 has been accepted > https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship
[ https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong updated FLINK-31275: -- Description: FLIP-314 has been accepted https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener (was: Currently flink generates job id in `JobGraph` which can identify a job. On the other hand, flink create source/sink table in planner. We need to create relations between source and sink tables for the job with an identifier id) > Flink supports reporting and storage of source/sink tables relationship > --- > > Key: FLINK-31275 > URL: https://issues.apache.org/jira/browse/FLINK-31275 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Fang Yong >Priority: Major > > FLIP-314 has been accepted > https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship
[ https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17772734#comment-17772734 ] Fang Yong commented on FLINK-31275: --- [~ZhenqiuHuang] Thanks for your attention, FLIP-314 is being voted on https://lists.apache.org/thread/dxdqjc0dd8rf1vbdg755zo1n2zj1tj8d > Flink supports reporting and storage of source/sink tables relationship > --- > > Key: FLINK-31275 > URL: https://issues.apache.org/jira/browse/FLINK-31275 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Fang Yong >Priority: Major > > Currently flink generates a job id in `JobGraph` which can identify a job. On > the other hand, flink creates source/sink tables in the planner. We need to create > relations between the source and sink tables of a job with an identifier -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33166) Support setting root logger level by config
[ https://issues.apache.org/jira/browse/FLINK-33166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reassigned FLINK-33166: - Assignee: Zhanghao Chen > Support setting root logger level by config > --- > > Key: FLINK-33166 > URL: https://issues.apache.org/jira/browse/FLINK-33166 > Project: Flink > Issue Type: Improvement > Components: Runtime / Configuration >Reporter: Zhanghao Chen >Assignee: Zhanghao Chen >Priority: Major > > Users currently cannot change the logging level by config and have to modify the > cumbersome logger configuration file manually. We'd better provide a shortcut > and support setting the root logger level by config. There are a number of configs > already for logger settings, like {{env.log.dir}} for the logging dir and > {{env.log.max}} for the max number of old logging files to keep. We can name the > new config {{env.log.level}}. > Multiple loggers are configured in the configuration files, some with > logging levels different from the root logger to reduce irrelevant logs. In > most cases, only the root logger is relevant, so we can make {{env.log.level}} > apply to the root logger only for simplicity. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-32405) Initialize catalog listener for CatalogManager
[ https://issues.apache.org/jira/browse/FLINK-32405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong closed FLINK-32405. - Resolution: Duplicate Duplicate with FLINK-32404 > Initialize catalog listener for CatalogManager > -- > > Key: FLINK-32405 > URL: https://issues.apache.org/jira/browse/FLINK-32405 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: FangYong >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32405) Initialize catalog listener for CatalogManager
[ https://issues.apache.org/jira/browse/FLINK-32405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767354#comment-17767354 ] Fang Yong commented on FLINK-32405: --- OK, it is a duplicate issue with FLINK-32404 > Initialize catalog listener for CatalogManager > -- > > Key: FLINK-32405 > URL: https://issues.apache.org/jira/browse/FLINK-32405 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: FangYong >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Reopened] (FLINK-32405) Initialize catalog listener for CatalogManager
[ https://issues.apache.org/jira/browse/FLINK-32405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reopened FLINK-32405: --- > Initialize catalog listener for CatalogManager > -- > > Key: FLINK-32405 > URL: https://issues.apache.org/jira/browse/FLINK-32405 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: FangYong >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-32848) [JUnit5 Migration] The persistence, query, registration, rpc and shuffle packages of flink-runtime module
[ https://issues.apache.org/jira/browse/FLINK-32848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong closed FLINK-32848. - Resolution: Fixed Fixed by 2ced46b6ed4ebfb20f9b392cba4e789e16e4da7d...f2cb1d247283344e9194e63931a2948e09f73c93 > [JUnit5 Migration] The persistence, query, registration, rpc and shuffle > packages of flink-runtime module > - > > Key: FLINK-32848 > URL: https://issues.apache.org/jira/browse/FLINK-32848 > Project: Flink > Issue Type: Sub-task > Components: Tests >Reporter: Rui Fan >Assignee: Zhanghao Chen >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33033) Add haservice micro benchmark for olap
[ https://issues.apache.org/jira/browse/FLINK-33033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reassigned FLINK-33033: - Assignee: Fang Yong > Add haservice micro benchmark for olap > -- > > Key: FLINK-33033 > URL: https://issues.apache.org/jira/browse/FLINK-33033 > Project: Flink > Issue Type: Sub-task > Components: Benchmarks >Affects Versions: 1.19.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > > Add micro benchmarks of haservice for olap to improve the performance for > short-lived jobs -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33051) GlobalFailureHandler interface should be retired in favor of LabeledGlobalFailureHandler
[ https://issues.apache.org/jira/browse/FLINK-33051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reassigned FLINK-33051: - Assignee: Matt Wang > GlobalFailureHandler interface should be retired in favor of > LabeledGlobalFailureHandler > > > Key: FLINK-33051 > URL: https://issues.apache.org/jira/browse/FLINK-33051 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Panagiotis Garefalakis >Assignee: Matt Wang >Priority: Minor > > FLIP-304 introduced the `LabeledGlobalFailureHandler` interface, which is an > extension of the `GlobalFailureHandler` interface. The latter can thus be removed > in the future to avoid keeping interfaces with duplicate functionality. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-32396) Support timestamp for jdbc driver and gateway
[ https://issues.apache.org/jira/browse/FLINK-32396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong closed FLINK-32396. - Fix Version/s: 1.19.0 Resolution: Fixed Fixed by c33a6527d8383dc571e0b648b8a29322416ab9d6 > Support timestamp for jdbc driver and gateway > - > > Key: FLINK-32396 > URL: https://issues.apache.org/jira/browse/FLINK-32396 > Project: Flink > Issue Type: Improvement > Components: Table SQL / JDBC >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Minor > Labels: auto-deprioritized-major, pull-request-available > Fix For: 1.19.0 > > > Support timestamp and timestamp_ltz data type for jdbc driver and sql-gateway -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33033) Add haservice micro benchmark for olap
Fang Yong created FLINK-33033: - Summary: Add haservice micro benchmark for olap Key: FLINK-33033 URL: https://issues.apache.org/jira/browse/FLINK-33033 Project: Flink Issue Type: Sub-task Components: Benchmarks Affects Versions: 1.19.0 Reporter: Fang Yong Add micro benchmarks of haservice for olap to improve the performance for short-lived jobs -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-25356) Add benchmarks for performance in OLAP scenarios
[ https://issues.apache.org/jira/browse/FLINK-25356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong updated FLINK-25356: -- Parent: (was: FLINK-25318) Issue Type: Technical Debt (was: Sub-task) > Add benchmarks for performance in OLAP scenarios > > > Key: FLINK-25356 > URL: https://issues.apache.org/jira/browse/FLINK-25356 > Project: Flink > Issue Type: Technical Debt > Components: Benchmarks >Reporter: Xintong Song >Priority: Major > > As discussed in FLINK-25318, we would need a unified, public visible > benchmark setups, for supporting OLAP performance improvements and > investigations. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32667) Use standalone store and embedding writer for jobs with no-restart-strategy in session cluster
[ https://issues.apache.org/jira/browse/FLINK-32667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17761807#comment-17761807 ] Fang Yong commented on FLINK-32667: --- [~mapohl] As we discussed in the threads https://lists.apache.org/thread/2wj1wfzcg162534v8olqt18y2x9x99od and https://lists.apache.org/thread/szdr4ngrfcmo7zko4917393zbqhgw0v5, we have come to an agreement about flink olap and [~jark] has already added the relevant content to the flink roadmap https://flink.apache.org/roadmap/ , so I think we can continue to promote olap-related issues. Regarding this issue, I would like to synchronize our current progress. We did a simple e2e test internally and found that the latency of the simplest query with HA is 6.5 times higher than that without HA. I think we can add these olap micro benchmarks to flink-benchmarks for the HA service and then continue this issue. After that, we can add a micro benchmark for the new interfaces to compare with the previous one. What do you think? Thanks > Use standalone store and embedding writer for jobs with no-restart-strategy > in session cluster > -- > > Key: FLINK-32667 > URL: https://issues.apache.org/jira/browse/FLINK-32667 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available > > When a flink session cluster uses the zk or k8s high availability service, it will > store jobs in zk or a ConfigMap. When we submit flink olap jobs to the session > cluster, they always turn off the restart strategy. These jobs with > no-restart-strategy should not be stored in zk or a ConfigMap in k8s -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32755) Add quick start guide for Flink OLAP
[ https://issues.apache.org/jira/browse/FLINK-32755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong updated FLINK-32755: -- Fix Version/s: 1.19.0 > Add quick start guide for Flink OLAP > > > Key: FLINK-32755 > URL: https://issues.apache.org/jira/browse/FLINK-32755 > Project: Flink > Issue Type: Sub-task > Components: Documentation >Reporter: xiangyu feng >Assignee: xiangyu feng >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > > I propose to add a new {{QUICKSTART.md}} guide that provides instructions for > beginners to build a production-ready Flink OLAP Service by using > flink-jdbc-driver, flink-sql-gateway and a flink session cluster. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-32755) Add quick start guide for Flink OLAP
[ https://issues.apache.org/jira/browse/FLINK-32755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong closed FLINK-32755. - Resolution: Fixed Fixed by fbef3c22757a2352145599487beb84e02aaeb389 > Add quick start guide for Flink OLAP > > > Key: FLINK-32755 > URL: https://issues.apache.org/jira/browse/FLINK-32755 > Project: Flink > Issue Type: Sub-task > Components: Documentation >Reporter: xiangyu feng >Assignee: xiangyu feng >Priority: Major > Labels: pull-request-available > > I propose to add a new {{QUICKSTART.md}} guide that provides instructions for > beginners to build a production-ready Flink OLAP Service by using > flink-jdbc-driver, flink-sql-gateway and a flink session cluster. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-32749) Sql gateway supports default catalog loaded by CatalogStore
[ https://issues.apache.org/jira/browse/FLINK-32749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong closed FLINK-32749. - Fix Version/s: 1.19.0 Resolution: Fixed Fixed by 1797a70c0074b9e0cbfd20abc2c33050a7906578 > Sql gateway supports default catalog loaded by CatalogStore > --- > > Key: FLINK-32749 > URL: https://issues.apache.org/jira/browse/FLINK-32749 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Gateway >Affects Versions: 1.19.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > > Currently the sql gateway creates a memory catalog as the default catalog; it > should support a default catalog loaded by the catalog store -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32755) Add quick start guide for Flink OLAP
[ https://issues.apache.org/jira/browse/FLINK-32755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong updated FLINK-32755: -- Parent: FLINK-32898 Issue Type: Sub-task (was: Improvement) > Add quick start guide for Flink OLAP > > > Key: FLINK-32755 > URL: https://issues.apache.org/jira/browse/FLINK-32755 > Project: Flink > Issue Type: Sub-task > Components: Documentation >Reporter: xiangyu feng >Assignee: xiangyu feng >Priority: Major > Labels: pull-request-available > > I propose to add a new {{QUICKSTART.md}} guide that provides instructions for > beginners to build a production-ready Flink OLAP Service by using > flink-jdbc-driver, flink-sql-gateway and a flink session cluster. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32755) Add quick start guide for Flink OLAP
[ https://issues.apache.org/jira/browse/FLINK-32755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong updated FLINK-32755: -- Parent: (was: FLINK-32898) Issue Type: Improvement (was: Sub-task) > Add quick start guide for Flink OLAP > > > Key: FLINK-32755 > URL: https://issues.apache.org/jira/browse/FLINK-32755 > Project: Flink > Issue Type: Improvement > Components: Documentation >Reporter: xiangyu feng >Assignee: xiangyu feng >Priority: Major > Labels: pull-request-available > > I propose to add a new {{QUICKSTART.md}} guide that provides instructions for > beginners to build a production-ready Flink OLAP Service by using > flink-jdbc-driver, flink-sql-gateway and a flink session cluster. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32755) Add quick start guide for Flink OLAP
[ https://issues.apache.org/jira/browse/FLINK-32755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong updated FLINK-32755: -- Parent: FLINK-32898 Issue Type: Sub-task (was: Improvement) > Add quick start guide for Flink OLAP > > > Key: FLINK-32755 > URL: https://issues.apache.org/jira/browse/FLINK-32755 > Project: Flink > Issue Type: Sub-task > Components: Documentation >Reporter: xiangyu feng >Assignee: xiangyu feng >Priority: Major > Labels: pull-request-available > > I propose to add a new {{QUICKSTART.md}} guide that provides instructions for > beginners to build a production-ready Flink OLAP Service by using > flink-jdbc-driver, flink-sql-gateway and a flink session cluster. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32755) Add quick start guide for Flink OLAP
[ https://issues.apache.org/jira/browse/FLINK-32755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong updated FLINK-32755: -- Parent: (was: FLINK-25318) Issue Type: Improvement (was: Sub-task) > Add quick start guide for Flink OLAP > > > Key: FLINK-32755 > URL: https://issues.apache.org/jira/browse/FLINK-32755 > Project: Flink > Issue Type: Improvement > Components: Documentation >Reporter: xiangyu feng >Assignee: xiangyu feng >Priority: Major > Labels: pull-request-available > > I propose to add a new {{QUICKSTART.md}} guide that provides instructions for > beginners to build a production-ready Flink OLAP Service by using > flink-jdbc-driver, flink-sql-gateway and a flink session cluster. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-32968) Fix doc for customized catalog listener
[ https://issues.apache.org/jira/browse/FLINK-32968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong closed FLINK-32968. - Fix Version/s: 1.18.0 1.19.0 Resolution: Fixed Fixed by d62feeca45290da610aa8aa1790e80f57397cd6d > Fix doc for customized catalog listener > --- > > Key: FLINK-32968 > URL: https://issues.apache.org/jira/browse/FLINK-32968 > Project: Flink > Issue Type: Improvement > Components: Documentation >Affects Versions: 1.18.0, 1.19.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0, 1.19.0 > > > Refer to https://issues.apache.org/jira/browse/FLINK-32798 for more details -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-32396) Support timestamp for jdbc driver and gateway
[ https://issues.apache.org/jira/browse/FLINK-32396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reassigned FLINK-32396: - Assignee: Fang Yong > Support timestamp for jdbc driver and gateway > - > > Key: FLINK-32396 > URL: https://issues.apache.org/jira/browse/FLINK-32396 > Project: Flink > Issue Type: Improvement > Components: Table SQL / JDBC >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available, stale-major > > Support timestamp and timestamp_ltz data type for jdbc driver and sql-gateway -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-32749) Sql gateway supports default catalog loaded by CatalogStore
[ https://issues.apache.org/jira/browse/FLINK-32749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reassigned FLINK-32749: - Assignee: Fang Yong > Sql gateway supports default catalog loaded by CatalogStore > --- > > Key: FLINK-32749 > URL: https://issues.apache.org/jira/browse/FLINK-32749 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Gateway >Affects Versions: 1.19.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available > > Currently the sql gateway creates a memory catalog as the default catalog; it > should support a default catalog loaded by the catalog store -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-32968) Fix doc for customized catalog listener
[ https://issues.apache.org/jira/browse/FLINK-32968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reassigned FLINK-32968: - Assignee: Fang Yong > Fix doc for customized catalog listener > --- > > Key: FLINK-32968 > URL: https://issues.apache.org/jira/browse/FLINK-32968 > Project: Flink > Issue Type: Improvement > Components: Documentation >Affects Versions: 1.18.0, 1.19.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available > > Refer to https://issues.apache.org/jira/browse/FLINK-32798 for more details -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32968) Fix doc for customized catalog listener
[ https://issues.apache.org/jira/browse/FLINK-32968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong updated FLINK-32968: -- Summary: Fix doc for customized catalog listener (was: Update doc for customized catalog listener) > Fix doc for customized catalog listener > --- > > Key: FLINK-32968 > URL: https://issues.apache.org/jira/browse/FLINK-32968 > Project: Flink > Issue Type: Improvement > Components: Documentation >Affects Versions: 1.18.0, 1.19.0 >Reporter: Fang Yong >Priority: Major > > Refer to https://issues.apache.org/jira/browse/FLINK-32798 for more details -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-32968) Update doc for customized catalog listener
Fang Yong created FLINK-32968: - Summary: Update doc for customized catalog listener Key: FLINK-32968 URL: https://issues.apache.org/jira/browse/FLINK-32968 Project: Flink Issue Type: Improvement Components: Documentation Affects Versions: 1.18.0, 1.19.0 Reporter: Fang Yong Refer to https://issues.apache.org/jira/browse/FLINK-32798 for more details -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32798) Release Testing: Verify FLIP-294: Support Customized Catalog Modification Listener
[ https://issues.apache.org/jira/browse/FLINK-32798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759072#comment-17759072 ] Fang Yong commented on FLINK-32798: --- [~ruanhang1993] Thanks for your suggestions and I will create an issue for the docs cc [~renqs] > Release Testing: Verify FLIP-294: Support Customized Catalog Modification > Listener > -- > > Key: FLINK-32798 > URL: https://issues.apache.org/jira/browse/FLINK-32798 > Project: Flink > Issue Type: Sub-task > Components: Tests >Affects Versions: 1.18.0 >Reporter: Qingsheng Ren >Assignee: Hang Ruan >Priority: Major > Fix For: 1.18.0 > > Attachments: result.png, sqls.png, test.png > > > The document about catalog modification listener is: > https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/catalogs/#catalog-modification-listener -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32798) Release Testing: Verify FLIP-294: Support Customized Catalog Modification Listener
[ https://issues.apache.org/jira/browse/FLINK-32798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758019#comment-17758019 ] Fang Yong commented on FLINK-32798: --- [~ruanhang1993] I checked the implementation in the sql-gateway and there's an issue in the document https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/catalogs/#catalog-modification-listener in that users cannot set the listener with `SET` for the sql-gateway. Users should add the option `table.catalog-modification.listeners` to flink-conf.yaml or pass it as a dynamic parameter to the sql-gateway. How can I create a PR for this? cc [~renqs] > Release Testing: Verify FLIP-294: Support Customized Catalog Modification > Listener > -- > > Key: FLINK-32798 > URL: https://issues.apache.org/jira/browse/FLINK-32798 > Project: Flink > Issue Type: Sub-task > Components: Tests >Affects Versions: 1.18.0 >Reporter: Qingsheng Ren >Priority: Major > Fix For: 1.18.0 > > Attachments: test.png > > > The document about catalog modification listener is: > https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/catalogs/#catalog-modification-listener -- This message was sent by Atlassian Jira (v8.20.10#820010)
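For reference, the two working ways of configuring the listener for the sql-gateway described in the comment above look like this (`my-listener` is a hypothetical factory identifier registered via the catalog modification listener factory SPI):

```yaml
# flink-conf.yaml
table.catalog-modification.listeners: my-listener
```

or as a dynamic parameter when starting the gateway:

```shell
./bin/sql-gateway.sh start -Dtable.catalog-modification.listeners=my-listener
```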
[jira] [Updated] (FLINK-32794) Release Testing: Verify FLIP-293: Introduce Flink Jdbc Driver For Sql Gateway
[ https://issues.apache.org/jira/browse/FLINK-32794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong updated FLINK-32794: -- Description: Document for jdbc driver: https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/jdbcdriver/ > Release Testing: Verify FLIP-293: Introduce Flink Jdbc Driver For Sql Gateway > - > > Key: FLINK-32794 > URL: https://issues.apache.org/jira/browse/FLINK-32794 > Project: Flink > Issue Type: Sub-task > Components: Tests >Affects Versions: 1.18.0 >Reporter: Qingsheng Ren >Priority: Major > Fix For: 1.18.0 > > > Document for jdbc driver: > https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/jdbcdriver/ -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32794) Release Testing: Verify FLIP-293: Introduce Flink Jdbc Driver For Sql Gateway
[ https://issues.apache.org/jira/browse/FLINK-32794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17757760#comment-17757760 ] Fang Yong commented on FLINK-32794: --- Thanks [~renqs], DONE > Release Testing: Verify FLIP-293: Introduce Flink Jdbc Driver For Sql Gateway > - > > Key: FLINK-32794 > URL: https://issues.apache.org/jira/browse/FLINK-32794 > Project: Flink > Issue Type: Sub-task > Components: Tests >Affects Versions: 1.18.0 >Reporter: Qingsheng Ren >Priority: Major > Fix For: 1.18.0 > > > Document for jdbc driver: > https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/jdbcdriver/ -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32798) Release Testing: Verify FLIP-294: Support Customized Catalog Modification Listener
[ https://issues.apache.org/jira/browse/FLINK-32798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17757759#comment-17757759 ] Fang Yong commented on FLINK-32798: --- [~renqs] I have added the document link in the "Description" > Release Testing: Verify FLIP-294: Support Customized Catalog Modification > Listener > -- > > Key: FLINK-32798 > URL: https://issues.apache.org/jira/browse/FLINK-32798 > Project: Flink > Issue Type: Sub-task > Components: Tests >Affects Versions: 1.18.0 >Reporter: Qingsheng Ren >Priority: Major > Fix For: 1.18.0 > > > The document about catalog modification listener is: > https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/catalogs/#catalog-modification-listener -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32798) Release Testing: Verify FLIP-294: Support Customized Catalog Modification Listener
[ https://issues.apache.org/jira/browse/FLINK-32798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong updated FLINK-32798: -- Description: The document about catalog modification listener is: https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/catalogs/#catalog-modification-listener > Release Testing: Verify FLIP-294: Support Customized Catalog Modification > Listener > -- > > Key: FLINK-32798 > URL: https://issues.apache.org/jira/browse/FLINK-32798 > Project: Flink > Issue Type: Sub-task > Components: Tests >Affects Versions: 1.18.0 >Reporter: Qingsheng Ren >Priority: Major > Fix For: 1.18.0 > > > The document about catalog modification listener is: > https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/catalogs/#catalog-modification-listener -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-32653) Add doc for catalog store
[ https://issues.apache.org/jira/browse/FLINK-32653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong closed FLINK-32653. - Fix Version/s: 1.19.0 Resolution: Fixed Thanks [~hackergin], fixed by 5628c7097875be4bd56fc7805dbdd727d92bdac7 > Add doc for catalog store > - > > Key: FLINK-32653 > URL: https://issues.apache.org/jira/browse/FLINK-32653 > Project: Flink > Issue Type: Sub-task > Components: Documentation >Reporter: Feng Jin >Assignee: Feng Jin >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-25015) job name should not always be `collect` submitted by sql client
[ https://issues.apache.org/jira/browse/FLINK-25015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reassigned FLINK-25015: - Assignee: xiangyu feng (was: KevinyhZou) > job name should not always be `collect` submitted by sql client > --- > > Key: FLINK-25015 > URL: https://issues.apache.org/jira/browse/FLINK-25015 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Client >Affects Versions: 1.14.0 >Reporter: KevinyhZou >Assignee: xiangyu feng >Priority: Major > Labels: pull-request-available, stale-assigned > Attachments: image-2021-11-23-20-15-32-459.png, > image-2021-11-23-20-16-21-932.png > > > I use flink sql client to submitted different sql query to flink session > cluster, and the sql job name is always `collect`, as below > !image-2021-11-23-20-16-21-932.png! > which make no sence to users. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-31259) Gateway supports initialization of catalog at startup
[ https://issues.apache.org/jira/browse/FLINK-31259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong closed FLINK-31259. - Resolution: Fixed See https://issues.apache.org/jira/browse/FLINK-32427 > Gateway supports initialization of catalog at startup > - > > Key: FLINK-31259 > URL: https://issues.apache.org/jira/browse/FLINK-31259 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Gateway >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available, stale-assigned > > Support to initializing catalogs in gateway when it starts -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-32660) Support external file systems in FileCatalogStore
[ https://issues.apache.org/jira/browse/FLINK-32660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong closed FLINK-32660. - Resolution: Fixed Thanks [~ferenc-csaky], fixed with f68967a7c938f6258125c3221be74522965d1ff2 > Support external file systems in FileCatalogStore > - > > Key: FLINK-32660 > URL: https://issues.apache.org/jira/browse/FLINK-32660 > Project: Flink > Issue Type: Sub-task >Reporter: Ferenc Csaky >Assignee: Ferenc Csaky >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32667) Use standalone store and embedding writer for jobs with no-restart-strategy in session cluster
[ https://issues.apache.org/jira/browse/FLINK-32667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17753091#comment-17753091 ] Fang Yong commented on FLINK-32667: --- [~mapohl] Thanks for your detailed explanation and I caught it. Currently Flink will create `FailoverStrategy` for jobs according to the cluster configuration and as I mentioned above there are only two strategies: full and region. I think decoupling `LeaderElection` and `job-related HA data` from `HighAvailabilityServices` is a very good solution, that's what we want for OLAP queries. As you mentioned in [FLINK-31816|https://issues.apache.org/jira/browse/FLINK-31816?focusedCommentId=17741054&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17741054]: `HighAvailabilityServices could get a single implementation that requires a factory method for creating JobGraphStore, JobResultStore, CheckpointRecoveryFactory, and BlobStore. Additionally, it would require a LeaderElectionService (which is essentially a factory for LeaderElection instances)` I think we can do it now and after that we can add a new failover strategy such as `none` for cluster and create embedding factory. What do you think? [~mapohl][~chesnay] > Use standalone store and embedding writer for jobs with no-restart-strategy > in session cluster > -- > > Key: FLINK-32667 > URL: https://issues.apache.org/jira/browse/FLINK-32667 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available > > When a flink session cluster use zk or k8s high availability service, it will > store jobs in zk or ConfigMap. When we submit flink olap jobs to the session > cluster, they always turn off restart strategy. These jobs with > no-restart-strategy should not be stored in zk or ConfigMap in k8s -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-32667) Use standalone store and embedding writer for jobs with no-restart-strategy in session cluster
[ https://issues.apache.org/jira/browse/FLINK-32667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17753091#comment-17753091 ] Fang Yong edited comment on FLINK-32667 at 8/11/23 8:00 AM: [~mapohl] Thanks for your detailed explanation and I caught it. Currently Flink will create `FailoverStrategy` for jobs according to the cluster configuration and as I mentioned above there are only two strategies: full and region. I think decoupling `LeaderElection` and `job-related HA data` from `HighAvailabilityServices` is a very good solution, that's what we want for OLAP queries. As you mentioned in [FLINK-31816|https://issues.apache.org/jira/browse/FLINK-31816?focusedCommentId=17741054&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17741054]: ``` HighAvailabilityServices could get a single implementation that requires a factory method for creating JobGraphStore, JobResultStore, CheckpointRecoveryFactory, and BlobStore. Additionally, it would require a LeaderElectionService (which is essentially a factory for LeaderElection instances) ``` I think we can do it now and after that we can add a new failover strategy such as `none` for cluster and create embedding factory. What do you think? [~mapohl][~chesnay] was (Author: zjureel): [~mapohl] Thanks for your detailed explanation and I caught it. Currently Flink will create `FailoverStrategy` for jobs according to the cluster configuration and as I mentioned above there are only two strategies: full and region. I think decoupling `LeaderElection` and `job-related HA data` from `HighAvailabilityServices` is a very good solution, that's what we want for OLAP queries. 
As you mentioned in [FLINK-31816|https://issues.apache.org/jira/browse/FLINK-31816?focusedCommentId=17741054&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17741054]: `HighAvailabilityServices could get a single implementation that requires a factory method for creating JobGraphStore, JobResultStore, CheckpointRecoveryFactory, and BlobStore. Additionally, it would require a LeaderElectionService (which is essentially a factory for LeaderElection instances)` I think we can do it now and after that we can add a new failover strategy such as `none` for cluster and create embedding factory. What do you think? [~mapohl][~chesnay] > Use standalone store and embedding writer for jobs with no-restart-strategy > in session cluster > -- > > Key: FLINK-32667 > URL: https://issues.apache.org/jira/browse/FLINK-32667 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available > > When a flink session cluster use zk or k8s high availability service, it will > store jobs in zk or ConfigMap. When we submit flink olap jobs to the session > cluster, they always turn off restart strategy. These jobs with > no-restart-strategy should not be stored in zk or ConfigMap in k8s -- This message was sent by Atlassian Jira (v8.20.10#820010)
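The single-implementation design quoted from FLINK-31816 can be sketched as follows. This is an illustrative sketch only: the interfaces below are minimal stand-ins for the services named in the quoted comment, not the actual Flink classes, and the class name `ComposedHaServices` is invented for the example.

```java
import java.util.function.Supplier;

// Minimal stand-ins for the services named in the quoted comment (not Flink's real types).
interface JobGraphStore {}
interface JobResultStore {}
interface CheckpointRecoveryFactory {}
interface BlobStore {}
interface LeaderElectionService {}

// One HighAvailabilityServices-style implementation, parameterized by factory
// methods: swapping ZooKeeper-backed factories for in-memory ones changes the
// behavior without writing a new HighAvailabilityServices subclass.
final class ComposedHaServices {
    private final Supplier<JobGraphStore> jobGraphStores;
    private final Supplier<JobResultStore> jobResultStores;
    private final Supplier<CheckpointRecoveryFactory> checkpointRecoveryFactories;
    private final Supplier<BlobStore> blobStores;
    private final LeaderElectionService leaderElectionService;

    ComposedHaServices(
            Supplier<JobGraphStore> jobGraphStores,
            Supplier<JobResultStore> jobResultStores,
            Supplier<CheckpointRecoveryFactory> checkpointRecoveryFactories,
            Supplier<BlobStore> blobStores,
            LeaderElectionService leaderElectionService) {
        this.jobGraphStores = jobGraphStores;
        this.jobResultStores = jobResultStores;
        this.checkpointRecoveryFactories = checkpointRecoveryFactories;
        this.blobStores = blobStores;
        this.leaderElectionService = leaderElectionService;
    }

    JobGraphStore createJobGraphStore() { return jobGraphStores.get(); }
    JobResultStore createJobResultStore() { return jobResultStores.get(); }
    CheckpointRecoveryFactory createCheckpointRecoveryFactory() { return checkpointRecoveryFactories.get(); }
    BlobStore createBlobStore() { return blobStores.get(); }
    LeaderElectionService getLeaderElectionService() { return leaderElectionService; }
}
```

Under this shape, an OLAP-oriented deployment could pass in-memory factories for the job-related stores while keeping a durable LeaderElectionService for leader discovery, which is the decoupling discussed in the comment.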
[jira] [Commented] (FLINK-32747) Fix ddl operations for catalog from CatalogStore
[ https://issues.apache.org/jira/browse/FLINK-32747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752610#comment-17752610 ] Fang Yong commented on FLINK-32747: --- Fixed with 76554d186ad198384ff783e3e3f040dd738b7571 > Fix ddl operations for catalog from CatalogStore > > > Key: FLINK-32747 > URL: https://issues.apache.org/jira/browse/FLINK-32747 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Affects Versions: 1.19.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available > > Fix ddl operations for catalog which loaded by CatalogStore -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-32747) Fix ddl operations for catalog from CatalogStore
[ https://issues.apache.org/jira/browse/FLINK-32747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong closed FLINK-32747. - Fix Version/s: 1.18.0 Resolution: Fixed > Fix ddl operations for catalog from CatalogStore > > > Key: FLINK-32747 > URL: https://issues.apache.org/jira/browse/FLINK-32747 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Affects Versions: 1.19.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0 > > > Fix ddl operations for catalog which loaded by CatalogStore -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32667) Use standalone store and embedding writer for jobs with no-restart-strategy in session cluster
[ https://issues.apache.org/jira/browse/FLINK-32667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752214#comment-17752214 ] Fang Yong commented on FLINK-32667: --- [~chesnay] In addition to the above solution, I think we can add a new failover strategy. Currently there are two failover strategy: full and region, we can add a new strategy such as `none` for the jobs and they don't need to store relevant information, WDYT? > Use standalone store and embedding writer for jobs with no-restart-strategy > in session cluster > -- > > Key: FLINK-32667 > URL: https://issues.apache.org/jira/browse/FLINK-32667 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available > > When a flink session cluster use zk or k8s high availability service, it will > store jobs in zk or ConfigMap. When we submit flink olap jobs to the session > cluster, they always turn off restart strategy. These jobs with > no-restart-strategy should not be stored in zk or ConfigMap in k8s -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32667) Use standalone store and embedding writer for jobs with no-restart-strategy in session cluster
[ https://issues.apache.org/jira/browse/FLINK-32667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752036#comment-17752036 ] Fang Yong commented on FLINK-32667: --- [~chesnay] [~mapohl] Currently the ha service consists of two parts. Part 1: the Flink cluster registers its dispatcher address and rest port with the ha service, such as zk or a configmap on k8s, so the client can get this information and submit jobs to the dispatcher via rest; this is needed in the OLAP scenario. Part 2: the Dispatcher validates, saves and recovers jobs from the JobGraphStore and JobResultStore when a job requires failover. In the OLAP scenario, Part 2 stores the jobs in external systems (zk/s3), which may cause latency jitter in olap queries, so we only need Part 1, not Part 2. My original idea was to add an option for the ha service so it can use an embedded JobGraphStore and an in-memory JobResultStore, which may be the more suitable solution. But then a job will not be recovered when the JM crashes, even if it is configured with failover. The second idea is that we do not store jobs without failover in the ha service, but currently we can only check this from the restart strategy, and there are also the issues mentioned by [~chesnay] above. What do you think of this, do you think the first idea (adding an option for the ha service) is feasible? Looking forward to your feedback, thanks > Use standalone store and embedding writer for jobs with no-restart-strategy > in session cluster > -- > > Key: FLINK-32667 > URL: https://issues.apache.org/jira/browse/FLINK-32667 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available > > When a flink session cluster use zk or k8s high availability service, it will > store jobs in zk or ConfigMap. When we submit flink olap jobs to the session > cluster, they always turn off restart strategy. 
These jobs with > no-restart-strategy should not be stored in zk or ConfigMap in k8s -- This message was sent by Atlassian Jira (v8.20.10#820010)
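The store-selection rule proposed in this thread can be sketched as a simple predicate. This is illustrative only: the class and method names are invented for the example, and the real wiring would go through the cluster's high-availability services rather than a string-returning helper.

```java
// Sketch of the proposal: jobs submitted with the restart strategy disabled
// (typical for OLAP queries) skip the durable ZooKeeper/ConfigMap-backed job
// graph store and use an embedded in-memory store, so nothing is written to
// zk or k8s and no external-system latency is added to the query path.
final class JobStoreSelector {
    static String selectJobGraphStore(boolean haEnabled, boolean restartStrategyEnabled) {
        if (haEnabled && restartStrategyEnabled) {
            return "durable";   // ZooKeeper znode or Kubernetes ConfigMap
        }
        return "in-memory";     // standalone/embedded store, no external writes
    }
}
```

The trade-off the thread discusses follows directly from this rule: a job routed to the in-memory store cannot be recovered after a JobManager crash, which is why the check is tied to the restart strategy being off.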
[jira] [Updated] (FLINK-32747) Fix ddl operations for catalog from CatalogStore
[ https://issues.apache.org/jira/browse/FLINK-32747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong updated FLINK-32747: -- Description: Fix ddl operations for catalog which loaded by CatalogStore (was: Support ddl for catalog which loaded by CatalogStore) > Fix ddl operations for catalog from CatalogStore > > > Key: FLINK-32747 > URL: https://issues.apache.org/jira/browse/FLINK-32747 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Affects Versions: 1.19.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available > > Fix ddl operations for catalog which loaded by CatalogStore -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32747) Fix ddl operations for catalog from CatalogStore
[ https://issues.apache.org/jira/browse/FLINK-32747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong updated FLINK-32747: -- Summary: Fix ddl operations for catalog from CatalogStore (was: Support ddl for catalog from CatalogStore) > Fix ddl operations for catalog from CatalogStore > > > Key: FLINK-32747 > URL: https://issues.apache.org/jira/browse/FLINK-32747 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Affects Versions: 1.19.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available > > Support ddl for catalog which loaded by CatalogStore -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32667) Use standalone store and embedding writer for jobs with no-restart-strategy in session cluster
[ https://issues.apache.org/jira/browse/FLINK-32667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17751092#comment-17751092 ] Fang Yong commented on FLINK-32667: --- [~mapohl] I think performance testing is a good suggestion; we also discussed this and created FLINK-25356 at the beginning. I will consider how to add an e2e benchmark in the flink-benchmarks project, and I think we should add micro benchmarks for the primary issue. Unfortunately, there is currently no mailing-list discussion about short-living jobs in flink. We only discussed this with [~xtsong] off-line when we created FLINK-25318, but I strongly agree with you that we need to initiate a broader discussion on the community dev ML. I will collect our practical experiences and initiate a discussion on the ML later, thank you very much for your valuable suggestion! > Use standalone store and embedding writer for jobs with no-restart-strategy > in session cluster > -- > > Key: FLINK-32667 > URL: https://issues.apache.org/jira/browse/FLINK-32667 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available > > When a flink session cluster use zk or k8s high availability service, it will > store jobs in zk or ConfigMap. When we submit flink olap jobs to the session > cluster, they always turn off restart strategy. These jobs with > no-restart-strategy should not be stored in zk or ConfigMap in k8s -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-32749) Sql gateway supports default catalog loaded by CatalogStore
Fang Yong created FLINK-32749: - Summary: Sql gateway supports default catalog loaded by CatalogStore Key: FLINK-32749 URL: https://issues.apache.org/jira/browse/FLINK-32749 Project: Flink Issue Type: Improvement Components: Table SQL / Gateway Affects Versions: 1.19.0 Reporter: Fang Yong Currently the sql gateway creates an in-memory catalog as the default catalog; it should also support a default catalog loaded by the catalog store -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32667) Use standalone store and embedding writer for jobs with no-restart-strategy in session cluster
[ https://issues.apache.org/jira/browse/FLINK-32667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17750938#comment-17750938 ] Fang Yong commented on FLINK-32667: --- [~mapohl] We have an internally e2e test for olap queries to statistical latency and QPS. However, e2e testing requires some resources. I'm considering how to build this test in the community, and whether each important issue can be tested in flink-benchmarks project. What do you think of it? Thanks > Use standalone store and embedding writer for jobs with no-restart-strategy > in session cluster > -- > > Key: FLINK-32667 > URL: https://issues.apache.org/jira/browse/FLINK-32667 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available > > When a flink session cluster use zk or k8s high availability service, it will > store jobs in zk or ConfigMap. When we submit flink olap jobs to the session > cluster, they always turn off restart strategy. These jobs with > no-restart-strategy should not be stored in zk or ConfigMap in k8s -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-32747) Support ddl for catalog from CatalogStore
[ https://issues.apache.org/jira/browse/FLINK-32747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reassigned FLINK-32747: - Assignee: Fang Yong > Support ddl for catalog from CatalogStore > - > > Key: FLINK-32747 > URL: https://issues.apache.org/jira/browse/FLINK-32747 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Affects Versions: 1.19.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available > > Support ddl for catalog which loaded by CatalogStore -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-32747) Support ddl for catalog from CatalogStore
Fang Yong created FLINK-32747: - Summary: Support ddl for catalog from CatalogStore Key: FLINK-32747 URL: https://issues.apache.org/jira/browse/FLINK-32747 Project: Flink Issue Type: Sub-task Components: Table SQL / API Affects Versions: 1.19.0 Reporter: Fang Yong Support ddl for catalog which loaded by CatalogStore -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-32667) Use standalone store and embedding writer for jobs with no-restart-strategy in session cluster
[ https://issues.apache.org/jira/browse/FLINK-32667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reassigned FLINK-32667: - Assignee: Fang Yong > Use standalone store and embedding writer for jobs with no-restart-strategy > in session cluster > -- > > Key: FLINK-32667 > URL: https://issues.apache.org/jira/browse/FLINK-32667 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available > > When a flink session cluster use zk or k8s high availability service, it will > store jobs in zk or ConfigMap. When we submit flink olap jobs to the session > cluster, they always turn off restart strategy. These jobs with > no-restart-strategy should not be stored in zk or ConfigMap in k8s -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-25015) job name should not always be `collect` submitted by sql client
[ https://issues.apache.org/jira/browse/FLINK-25015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17747827#comment-17747827 ] Fang Yong commented on FLINK-25015: --- Hi [~zouyunhe], are you still working on this issue? I found this issue is created years ago. If you are not, we will fix it, thanks. > job name should not always be `collect` submitted by sql client > --- > > Key: FLINK-25015 > URL: https://issues.apache.org/jira/browse/FLINK-25015 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Client >Affects Versions: 1.14.0 >Reporter: KevinyhZou >Assignee: KevinyhZou >Priority: Major > Labels: pull-request-available, stale-assigned > Attachments: image-2021-11-23-20-15-32-459.png, > image-2021-11-23-20-16-21-932.png > > > I use flink sql client to submitted different sql query to flink session > cluster, and the sql job name is always `collect`, as below > !image-2021-11-23-20-16-21-932.png! > which make no sence to users. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-25015) job name should not always be `collect` submitted by sql client
[ https://issues.apache.org/jira/browse/FLINK-25015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong updated FLINK-25015: -- Parent: FLINK-25318 Issue Type: Sub-task (was: Improvement) > job name should not always be `collect` submitted by sql client > --- > > Key: FLINK-25015 > URL: https://issues.apache.org/jira/browse/FLINK-25015 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Client >Affects Versions: 1.14.0 >Reporter: KevinyhZou >Assignee: KevinyhZou >Priority: Major > Labels: pull-request-available, stale-assigned > Attachments: image-2021-11-23-20-15-32-459.png, > image-2021-11-23-20-16-21-932.png > > > I use flink sql client to submitted different sql query to flink session > cluster, and the sql job name is always `collect`, as below > !image-2021-11-23-20-16-21-932.png! > which make no sence to users. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-25015) job name should not always be `collect` submitted by sql client
[ https://issues.apache.org/jira/browse/FLINK-25015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17747763#comment-17747763 ] Fang Yong commented on FLINK-25015: --- [~zouyunhe] What's the progress of this issue, there're conflicts in the PR > job name should not always be `collect` submitted by sql client > --- > > Key: FLINK-25015 > URL: https://issues.apache.org/jira/browse/FLINK-25015 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Client >Affects Versions: 1.14.0 >Reporter: KevinyhZou >Assignee: KevinyhZou >Priority: Major > Labels: pull-request-available, stale-assigned > Attachments: image-2021-11-23-20-15-32-459.png, > image-2021-11-23-20-16-21-932.png > > > I use flink sql client to submitted different sql query to flink session > cluster, and the sql job name is always `collect`, as below > !image-2021-11-23-20-16-21-932.png! > which make no sence to users. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32698) Add getCheckpointOptions interface in ManagedSnapshotContext
[ https://issues.apache.org/jira/browse/FLINK-32698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17747737#comment-17747737 ] Fang Yong commented on FLINK-32698: --- [~Ming Li] `ManagedSnapshotContext` is a `public` interface, so adding anything to it should go through a FLIP. On the other hand, I found that the sink operator in `Paimon` is a `OneInputStreamOperator`, which is also `public` in flink; its `snapshotState` method receives the `snapshotType` via `CheckpointOptions`, so maybe we can consider that first. > Add getCheckpointOptions interface in ManagedSnapshotContext > > > Key: FLINK-32698 > URL: https://issues.apache.org/jira/browse/FLINK-32698 > Project: Flink > Issue Type: Improvement > Components: Runtime / Checkpointing >Reporter: Ming Li >Priority: Major > > Currently only `checkpointID` and `checkpointTimestamp` information are > provided in `ManagedSnapshotContext`. We hope to provide more > information from `CheckpointOptions`, so that operators can adopt > different logic when performing `snapshotState`. > > An example is to adopt different behaviors according to the type of > checkpoint. For example, in `Paimon`, we hope that the paimon's > snapshot written by a `checkpoint` can expire automatically, while the > paimon's snapshot written by a `savepoint` is persisted. -- This message was sent by Atlassian Jira (v8.20.10#820010)
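The behavior requested in this issue could be expressed along these lines once the snapshot type is visible to the operator. This is an illustrative sketch only: the enum and class below are stand-ins for the information carried by Flink's `CheckpointOptions`, not its actual API.

```java
// Stand-in for the snapshot-type information carried by CheckpointOptions.
enum SnapshotKind { CHECKPOINT, SAVEPOINT }

final class SnapshotExpirationPolicy {
    // The Paimon example from the issue: external snapshots written for
    // periodic checkpoints may expire automatically, while snapshots written
    // for savepoints are kept until explicitly deleted.
    static boolean autoExpire(SnapshotKind kind) {
        return kind == SnapshotKind.CHECKPOINT;
    }
}
```

An operator's `snapshotState` would consult such a policy per snapshot, which is exactly why the issue asks for `CheckpointOptions` to be exposed in `ManagedSnapshotContext`.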
[jira] [Commented] (FLINK-32402) FLIP-294: Support Customized Catalog Modification Listener
[ https://issues.apache.org/jira/browse/FLINK-32402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17747719#comment-17747719 ] Fang Yong commented on FLINK-32402: --- [~knaufk] Currently I think it's better to add this in `catalogs.md` and users can get all information about catalogs in that page, I noticed that some advanced usages of catalog are already in `cagalogs.md` such as `User-Defined Catalog`. What do you think? > FLIP-294: Support Customized Catalog Modification Listener > -- > > Key: FLINK-32402 > URL: https://issues.apache.org/jira/browse/FLINK-32402 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Ecosystem >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > > Issue for > https://cwiki.apache.org/confluence/display/FLINK/FLIP-294%3A+Support+Customized+Catalog+Modification+Listener -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-32676) Add doc for catalog modification listener
[ https://issues.apache.org/jira/browse/FLINK-32676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reassigned FLINK-32676: - Assignee: Fang Yong > Add doc for catalog modification listener > - > > Key: FLINK-32676 > URL: https://issues.apache.org/jira/browse/FLINK-32676 > Project: Flink > Issue Type: Improvement > Components: Documentation, Table SQL / Runtime >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available > > Add doc for catalog modification listener -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-32676) Add doc for catalog modification listener
Fang Yong created FLINK-32676: - Summary: Add doc for catalog modification listener Key: FLINK-32676 URL: https://issues.apache.org/jira/browse/FLINK-32676 Project: Flink Issue Type: Improvement Components: Documentation, Table SQL / Runtime Affects Versions: 1.18.0 Reporter: Fang Yong Add doc for catalog modification listener -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32402) FLIP-294: Support Customized Catalog Modification Listener
[ https://issues.apache.org/jira/browse/FLINK-32402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17747235#comment-17747235 ] Fang Yong commented on FLINK-32402: --- Thanks [~knaufk], I'll create an issue to add doc for this feature > FLIP-294: Support Customized Catalog Modification Listener > -- > > Key: FLINK-32402 > URL: https://issues.apache.org/jira/browse/FLINK-32402 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Ecosystem >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > > Issue for > https://cwiki.apache.org/confluence/display/FLINK/FLIP-294%3A+Support+Customized+Catalog+Modification+Listener -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-32667) Use standalone store and embedding writer for jobs with no-restart-strategy in session cluster
Fang Yong created FLINK-32667:
---------------------------------

             Summary: Use standalone store and embedding writer for jobs with no-restart-strategy in session cluster
                 Key: FLINK-32667
                 URL: https://issues.apache.org/jira/browse/FLINK-32667
             Project: Flink
          Issue Type: Sub-task
          Components: Runtime / Coordination
    Affects Versions: 1.18.0
            Reporter: Fang Yong


When a Flink session cluster uses the ZooKeeper or Kubernetes high-availability service, it stores its jobs in ZooKeeper or in a ConfigMap. Flink OLAP jobs submitted to the session cluster always turn off the restart strategy, so these jobs with no-restart-strategy should not be stored in ZooKeeper or in a Kubernetes ConfigMap.
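For context, the no-restart OLAP setup the issue describes is typically expressed with configuration like the following (a minimal flink-conf.yaml sketch; the ZooKeeper quorum addresses are placeholder assumptions, and option names should be verified against the Flink version in use):

```yaml
# flink-conf.yaml for a session cluster running short-lived OLAP jobs.
# High availability is backed by ZooKeeper; today, jobs are written to
# the HA job store regardless of their restart strategy.
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-1:2181,zk-2:2181,zk-3:2181

# OLAP-style jobs disable restarts entirely.
restart-strategy: none
```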
[jira] [Created] (FLINK-32665) Support read null value for csv format
Fang Yong created FLINK-32665:
---------------------------------

             Summary: Support read null value for csv format
                 Key: FLINK-32665
                 URL: https://issues.apache.org/jira/browse/FLINK-32665
             Project: Flink
          Issue Type: Bug
          Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
    Affects Versions: 1.18.0
            Reporter: Fang Yong


When there is a null column in a file with csv format, the Flink job throws an exception when it tries to parse the data.
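To illustrate the desired behavior, here is a stand-alone Python sketch (using only the standard library, not Flink's actual CSV deserializer) that maps a configurable null literal to None instead of failing on the cell:

```python
import csv
import io

def parse_csv_with_nulls(text, null_literal=""):
    """Parse CSV text, turning cells equal to null_literal into None.

    Illustrates the configurable null-literal idea; this is NOT Flink's
    CSV deserializer, and the function name is hypothetical.
    """
    rows = []
    for record in csv.reader(io.StringIO(text)):
        rows.append([None if cell == null_literal else cell
                     for cell in record])
    return rows

print(parse_csv_with_nulls("a,,c\n1,2,3\n"))
# the empty cell is surfaced as None instead of raising a parse error
```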
[jira] [Commented] (FLINK-29317) Add WebSocket in Dispatcher to support olap query submission and push results in session cluster
[ https://issues.apache.org/jira/browse/FLINK-29317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17746784#comment-17746784 ]

Fang Yong commented on FLINK-29317:
-----------------------------------

Hi [~dmvk], sorry for the late reply. We would now like to promote Flink OLAP improvements in the community. Are you still interested in this area? We plan to create the FLIP and start the discussion in the community later, thanks.

> Add WebSocket in Dispatcher to support olap query submission and push results
> in session cluster
> ------------------------------------------------------------------------------
>
>                 Key: FLINK-29317
>                 URL: https://issues.apache.org/jira/browse/FLINK-29317
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination, Runtime / REST
>    Affects Versions: 1.14.5, 1.15.3
>            Reporter: Fang Yong
>            Priority: Major
>
> Currently the client submits OLAP queries to the Flink session cluster via the
> HTTP REST API and pulls the results through interval polling. The sink task in
> the TaskManager creates a socket server for each query; when the JobManager
> receives a pull request from the client, it requests the query results from
> the socket server. The process is as follows:
>
> Job submission path:
> client -> http rest -> JobManager -> Sink Socket Server
> Result acquisition path:
> client <- http rest <- JobManager <- Sink Socket Server
>
> This leads to three problems:
> 1. Submitting jobs through HTTP REST incurs some performance loss; for
> example, temporary files are created for each job.
> 2. The client pulls result data at a certain time interval, which is a fixed
> cost. A larger interval increases query latency, while a smaller interval
> increases the pressure on the Dispatcher.
> 3. Each sink task initializes its own socket server, which increases query
> latency and wastes resources.
>
> For the Flink OLAP scenario, we propose adding a WebSocket protocol to the
> session cluster for submitting jobs and returning results. The client creates
> and manages a connection with the WebSocket server and submits OLAP queries
> to the session cluster. The TaskManagers also create and manage connections
> to the WebSocket server, and the sink tasks stream their results to the
> server. When the JobManager receives results from a sink task, it pushes the
> result data to the client through the established connection.
> We implemented this feature in ByteDance's internal Flink version. On
> average, the latency of each query is reduced by about 100 ms, which is a
> significant optimization for OLAP queries.
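The push-based result path described in the proposal can be sketched with stdlib asyncio streams standing in for a real WebSocket server (all names and the one-row-per-line framing are illustrative assumptions, not the proposed protocol):

```python
import asyncio

async def demo():
    results = ["row-1", "row-2", "row-3"]

    async def sink(reader, writer):
        # The "sink task" pushes each result row as soon as it is
        # produced, instead of waiting for the client to poll.
        for row in results:
            writer.write((row + "\n").encode())
            await writer.drain()
        writer.close()
        await writer.wait_closed()

    server = await asyncio.start_server(sink, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    # The client holds one long-lived connection, so no polling
    # interval sits in the latency path.
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    received = []
    while line := await reader.readline():
        received.append(line.decode().strip())
    writer.close()
    await writer.wait_closed()
    server.close()
    await server.wait_closed()
    return received

print(asyncio.run(demo()))
```

The key contrast with the current REST path is that the server decides when data flows; the client never re-requests results on a timer.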
[jira] [Created] (FLINK-32643) Introduce off-heap shared state cache across stateful operators in TM
Fang Yong created FLINK-32643:
---------------------------------

             Summary: Introduce off-heap shared state cache across stateful operators in TM
                 Key: FLINK-32643
                 URL: https://issues.apache.org/jira/browse/FLINK-32643
             Project: Flink
          Issue Type: Improvement
          Components: Runtime / State Backends
    Affects Versions: 1.19.0
            Reporter: Fang Yong


Currently each stateful operator creates an independent DB instance if it uses RocksDB as its state backend, and we can configure `state.backend.rocksdb.block.cache-size` for each DB to speed up state access. This parameter defaults to 8 MB, and we cannot set it too large (such as 512 MB), since that may cause OOM, and each DB cannot effectively utilize the memory. To address this issue, we would like to introduce an off-heap shared state cache across the multiple DB instances of stateful operators in the TM.
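For reference, the per-instance option the issue mentions is set like this today (a flink-conf.yaml sketch; the 256m value is an arbitrary example, not a recommendation):

```yaml
# Per-DB block cache, applied independently to EVERY RocksDB instance
# in the TaskManager -- the issue proposes replacing this with one
# off-heap cache shared across instances.
state.backend: rocksdb
state.backend.rocksdb.block.cache-size: 256m
```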
[jira] [Closed] (FLINK-32405) Initialize catalog listener for CatalogManager
[ https://issues.apache.org/jira/browse/FLINK-32405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fang Yong closed FLINK-32405.
-----------------------------

    Resolution: Fixed

> Initialize catalog listener for CatalogManager
> ----------------------------------------------
>
>                 Key: FLINK-32405
>                 URL: https://issues.apache.org/jira/browse/FLINK-32405
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table SQL / API
>    Affects Versions: 1.18.0
>            Reporter: Fang Yong
>            Assignee: Fang Yong
>            Priority: Major
[jira] [Created] (FLINK-32633) Kubernetes e2e test is not stable
Fang Yong created FLINK-32633:
---------------------------------

             Summary: Kubernetes e2e test is not stable
                 Key: FLINK-32633
                 URL: https://issues.apache.org/jira/browse/FLINK-32633
             Project: Flink
          Issue Type: Technical Debt
          Components: Deployment / Kubernetes, Kubernetes Operator
    Affects Versions: 1.18.0
            Reporter: Fang Yong


The output file is:
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51444&view=logs&j=bea52777-eaf8-5663-8482-18fbc3630e81&t=43ba8ce7-ebbf-57cd-9163-444305d74117

Jul 19 17:06:02 Stopping minikube ...
Jul 19 17:06:02 * Stopping node "minikube" ...
Jul 19 17:06:13 * 1 node stopped.
Jul 19 17:06:13 [FAIL] Test script contains errors.
Jul 19 17:06:13 Checking for errors...
Jul 19 17:06:13 No errors in log files.
Jul 19 17:06:13 Checking for exceptions...
Jul 19 17:06:13 No exceptions in log files.
Jul 19 17:06:13 Checking for non-empty .out files...
grep: /home/vsts/work/_temp/debug_files/flink-logs/*.out: No such file or directory
Jul 19 17:06:13 No non-empty .out files.
Jul 19 17:06:13
Jul 19 17:06:13 [FAIL] 'Run Kubernetes test' failed after 4 minutes and 28 seconds! Test exited with exit code 1
Jul 19 17:06:13 ##[group]Environment Information
Jul 19 17:06:13 Jps
[jira] [Commented] (FLINK-32622) Do not add mini-batch assigner operator if it is useless
[ https://issues.apache.org/jira/browse/FLINK-32622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17744215#comment-17744215 ]

Fang Yong commented on FLINK-32622:
-----------------------------------

cc [~libenchao] [~zoudan]

> Do not add mini-batch assigner operator if it is useless
> --------------------------------------------------------
>
>                 Key: FLINK-32622
>                 URL: https://issues.apache.org/jira/browse/FLINK-32622
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / Planner
>    Affects Versions: 1.19.0
>            Reporter: Fang Yong
>            Priority: Major
>
> Currently, if users configure mini-batch for their SQL jobs, Flink always
> adds a mini-batch assigner operator to the job plan even when there are no
> agg/join operators in the job. The mini-batch operator then generates useless
> events and causes performance issues. If mini-batch is useless for a specific
> job, Flink should not add the mini-batch assigner even when users turn on the
> mini-batch mechanism.
[jira] [Created] (FLINK-32622) Do not add mini-batch assigner operator if it is useless
Fang Yong created FLINK-32622:
---------------------------------

             Summary: Do not add mini-batch assigner operator if it is useless
                 Key: FLINK-32622
                 URL: https://issues.apache.org/jira/browse/FLINK-32622
             Project: Flink
          Issue Type: Improvement
          Components: Table SQL / Planner
    Affects Versions: 1.19.0
            Reporter: Fang Yong


Currently, if users configure mini-batch for their SQL jobs, Flink always adds a mini-batch assigner operator to the job plan even when there are no agg/join operators in the job. The mini-batch operator then generates useless events and causes performance issues. If mini-batch is useless for a specific job, Flink should not add the mini-batch assigner even when users turn on the mini-batch mechanism.
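The mini-batch mechanism referred to above is controlled by the following table options (shown as SQL client SET statements; the latency and size values are arbitrary examples):

```sql
-- Enabling mini-batch currently inserts a mini-batch assigner into
-- the plan even for pipelines with no aggregations or joins.
SET 'table.exec.mini-batch.enabled' = 'true';
SET 'table.exec.mini-batch.allow-latency' = '2s';
SET 'table.exec.mini-batch.size' = '5000';
```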
[jira] [Commented] (FLINK-32370) JDBC SQL gateway e2e test is unstable
[ https://issues.apache.org/jira/browse/FLINK-32370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739896#comment-17739896 ]

Fang Yong commented on FLINK-32370:
-----------------------------------

[~Sergey Nuyanzin] Sorry, I checked the shell script and found the missing double quotation marks; I'll add them.

> JDBC SQL gateway e2e test is unstable
> -------------------------------------
>
>                 Key: FLINK-32370
>                 URL: https://issues.apache.org/jira/browse/FLINK-32370
>             Project: Flink
>          Issue Type: Technical Debt
>          Components: Tests
>    Affects Versions: 1.18.0
>            Reporter: Chesnay Schepler
>            Assignee: Fang Yong
>            Priority: Blocker
>              Labels: pull-request-available, test-stability
>             Fix For: 1.18.0
>
>         Attachments: flink-vsts-sql-gateway-0-fv-az75-650.log,
>                      flink-vsts-standalonesession-0-fv-az75-650.log,
>                      flink-vsts-taskexecutor-0-fv-az75-650.log
>
> The client is failing while trying to collect data when the job already
> finished on the cluster.