Are the Table API Connectors production ready?
On Monday, 27 February, 2023 at 09:33:18 am IST, ravi_suryavanshi.yahoo.com via user wrote:

Hi Team,

In Flink 1.16.0, we would like to use some of the Table API Connectors for production. Kindly let me know if the below connectors are production ready or only for testing purposes.

| Name | Version | Source | Sink |
| Filesystem | | Bounded and Unbounded Scan, Lookup | Streaming Sink, Batch Sink |
| Elasticsearch | 6.x & 7.x | Not supported | Streaming Sink, Batch Sink |
| Opensearch | 1.x & 2.x | Not supported | Streaming Sink, Batch Sink |
| Apache Kafka | 0.10+ | Unbounded Scan | Streaming Sink, Batch Sink |
| Amazon DynamoDB | | Not supported | Streaming Sink, Batch Sink |
| Amazon Kinesis Data Streams | | Unbounded Scan | Streaming Sink |
| Amazon Kinesis Data Firehose | | Not supported | Streaming Sink |
| JDBC | | Bounded Scan, Lookup | Streaming Sink, Batch Sink |
| Apache HBase | 1.4.x & 2.2.x | Bounded Scan, Lookup | Streaming Sink, Batch Sink |
| Apache Hive |

Thanks and regards
Re: Are the Table API Connectors production ready?
On Thu, Mar 9, 2023 at 9:39 AM ravi_suryavanshi.yahoo.com via user wrote:

Hi,

Can anyone help me here?

Thanks and regards,
Ravi
Re: Are the Table API Connectors production ready?
On Monday, 13 March, 2023 at 12:36:46 AM, ravi_suryavanshi.yahoo.com via user wrote:

Thanks a lot, Yaroslav and Shammon.

I want to use the Filesystem connector. I tried it, and it works well as long as the job keeps running. If the job is restarted, however, it processes all the files again. I could not find an option to move or delete files after they have been collected. I also could not find a way to filter files by pattern. Pattern matching is required because different files exist in the same folder.

Regards,
Ravi

On Friday, 10 March, 2023 at 05:47:27 am IST, Shammon FY wrote:

Hi Ravi,

I agree with Yaroslav. If you find any problems in use, you can create an issue in Jira: https://issues.apache.org/jira/issues/?jql=project%20%3D%20FLINK . I have used the Kafka, JDBC, and Hive connectors in production too, and they work well.

Best,
Shammon

On Fri, Mar 10, 2023 at 1:42 AM Yaroslav Tkachenko wrote:

Hi Ravi,

All of them should be production ready. I've personally used half of them in production. Do you have any specific concerns?
Re: Are the Table API Connectors production ready?
Thank you all.

On Tuesday, 14 March, 2023 at 07:14:05 am IST, yuxia wrote:

The plan shows that the filter has been pushed down. But remember: although it is pushed down, the filesystem table won't accept the filter, so it will still scan all files.

Best regards,
Yuxia

From: Maryam Moafimadani
To: Hang Ruan
Cc: yuxia, ravi suryavanshi, Yaroslav Tkachenko, Shammon FY, User
Sent: Monday, 13 March 2023, 10:07:57 PM
Subject: Re: Are the Table API Connectors production ready?

Hi All,

It's exciting to see file filtering in the plan for development. I am curious whether the following query on a filesystem connector would actually push down the filter on the metadata column `file.path`:

    SELECT score, `file.path` FROM MyUserTable WHERE `file.path` LIKE '%prefix_%'

    == Optimized Execution Plan ==
    Calc(select=[score, file.path], where=[LIKE(file.path, '%2022070611284%')])
    +- TableSourceScan(table=[[default_catalog, default_database, MyUserTable, filter=[LIKE(file.path, _UTF-16LE'%2022070611284%')]]], fields=[score, file.path])

Thanks,
Maryam

--
Maryam Moafimadani
Senior Data Developer @Shopify

On Mon, Mar 13, 2023 at 8:55 AM Hang Ruan wrote:

Hi yuxia,

I would like to help complete this task.

Best,
Hang

On Mon, 13 March 2023 at 09:32, yuxia wrote:

Yeah, you're right. We don't provide filtering of files by pattern. Actually, we already have a Jira issue [1] for it. I intended to do this in the past but didn't have much time. Anyone who is interested can take it over. We're happy to help review.

[1] https://issues.apache.org/jira/browse/FLINK-17398

Best regards,
Yuxia
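For readers who want to reproduce the plan discussed in this thread, the following is a minimal sketch of a table definition that the quoted query could run against. Only the table name, the `score` column, and the `file.path` metadata column come from the thread; the path, the format, and the type of `score` are assumptions. As yuxia points out, a filter that appears inside TableSourceScan does not mean the filesystem source skips files, so the plan output should be read with that caveat.

```sql
-- Hypothetical table definition: path, format, and the INT type are assumptions.
CREATE TABLE MyUserTable (
  score INT,
  -- read-only metadata column exposed by the filesystem connector
  `file.path` STRING NOT NULL METADATA
) WITH (
  'connector' = 'filesystem',
  'path' = 'file:///tmp/input',  -- assumed input directory
  'format' = 'csv'               -- assumed file format
);

-- Print the optimized plan to see where the LIKE filter ends up.
EXPLAIN PLAN FOR
SELECT score, `file.path`
FROM MyUserTable
WHERE `file.path` LIKE '%prefix_%';
```

With this approach the pattern is applied to rows after the files have been read, so it narrows the result but not the set of files scanned.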
Table API function and expression vs SQL
On Fri, 24 March 2023 at 17:25, ravi_suryavanshi.yahoo.com via user wrote:

Hello Team,

I need your advice on which approach is recommended, given that I don't want to change my query code when Flink is upgraded to a higher version. The choice is between writing queries in Java code (Table API functions and expressions) and writing pure SQL. I am assuming that SQL will not be affected by an upgrade to a higher version.

Thanks and regards,
Ravi
Re: Table API function and expression vs SQL
Thanks a lot, Hang and Mate.

On Saturday, 25 March, 2023 at 06:21:49 pm IST, Mate Czagany wrote:

Hi,

Please also keep in mind that restoring existing Table API jobs from savepoints when upgrading to a newer minor version of Flink, e.g. 1.16 -> 1.17, is not supported, as the topology might change between these versions due to optimizer changes. See here for more information: https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/concepts/overview/#stateful-upgrades-and-evolution

Regards,
Mate

On Sat, 25 March 2023 at 13:38, Hang Ruan wrote:

Hi,

I think the SQL job is better. Flink SQL jobs can be easily shared with others for debugging, and SQL is more suitable for stream and batch integration. For the small share of jobs that cannot be expressed in SQL, we use the DataStream API.

Best,
Hang
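The documentation section Mate links to also describes compiled plans as a way to carry a stateful SQL pipeline across Flink versions. The following is a rough sketch only, assuming your Flink version supports the COMPILE PLAN and EXECUTE PLAN statements described there; the table names and the plan path are hypothetical.

```sql
-- Hypothetical names: my_source, my_sink, and the plan path are placeholders.
-- Before the upgrade: store the optimized plan of the INSERT statement as JSON.
COMPILE PLAN 'file:///tmp/plans/my_job.json' FOR
INSERT INTO my_sink
SELECT id, amount
FROM my_source;

-- After the upgrade: run the stored plan instead of re-planning the SQL text,
-- so that optimizer changes do not alter the job topology.
EXECUTE PLAN 'file:///tmp/plans/my_job.json';
```

Whether this fits a given job depends on the Flink version and the statements involved, so it is worth checking the linked section before relying on it.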
Re: Table API function and expression vs SQL
Hi,

We have decided to use the Table API with Flink SQL syntax (not Java). Can the SQL syntax change in a higher version? As per the docs, "SQL support is based on Apache Calcite which implements the SQL standard."

Thanks & Regards,
Ravi
Table API table2datastream (toChangelogStream)
Hi,

I am trying to use the Table API to convert Table data into a DataStream. The API is StreamTableEnvironment.toChangelogStream(Table table). I have noticed that its parallelism is always 1. How can I set it to more than one? If it is intended to execute with a single thread, is there any impact on scalability and performance?

Thanks and regards,
Ravi