??????Fiink-sql????????????????????????????

2022-02-21 Thread ?Y??????????????????
?? flink-table-runtime-blink # org.apache.flink.table.runtime.functions.SqlFunctionUtils ??demo ---- ??: "??"

Fiink-sql的官方函数的代码哪里可以看到

2022-02-21 Thread 王宇航
Hi: 经常会用到flink-sql的内置函数,因为官方函数比自己写的UDF更加健壮,想学习下官方函数是怎么写的,请问在哪一块能看到这个代码呢?

Re:hive 进行 overwrite 合并数据后文件变大?

2022-02-21 Thread 周瑞
是不是数据重复了,如果是ORC格式可以尝试执行alter table table_name partition (pt_dt='2021-02-20') concatenate 语句进行小文件的合并。 --Original-- From: "RS"; Date: 2022年2月22日(星期二) 上午9:36 To: "user-zh"; Subject: hive 进行 overwrite 合并数据后文件变大? Hi,

Re: [Table API] [JDBC CDC] Caching Configuration from MySql instance needed in multiple Flink Jobs

2022-02-21 Thread Leonard Xu
Hello, Dan > 2022年2月21日 下午9:11,Dan Serb 写道: > 1.Have a processor that uses Flink JDBC CDC Connector over the table that > stores the information I need. (This is implemented currently - working) You mean you’ve implemented a Flink JDBC Connector? Maybe the Flink CDC Connectors[1] would help

hive 进行 overwrite 合并数据后文件变大?

2022-02-21 Thread RS
Hi, flink写hive任务,checkpoint周期配置的比较短,生成了很多小文件,一天一个目录, 然后我调用flink sql合并之前的数据,跑完之后,发现存储变大了,请教下这个是什么原因导致的? 合并之前是很多小part文件,overwrite之后文件减少了,但是存储变大了,从274MB变大成2.9GB了? hive表table1的分区字段是`date` insert overwrite aw_topic_compact select * from `table1` where `date`='2022-02-21'; 合并前: 514.0 M 1.5 G

Re: Cannot upgrade helm chart

2022-02-21 Thread Austin Cawley-Edwards
Hey Marco, There’s unfortunately no perfect fit here, at least that I know of. A Deployment will make it possible to upgrade the image, but does not support container exits (eg if the Flink job completes, even successfully, K8s will still restart the container). If you are only running long lived

Re: Pulsar connector 2.9.1 failing job submission in standalone.

2022-02-21 Thread Ananth Gundabattula
Thanks a lot Yufei and Wong. I was able to get a version working by combining both the aspects mentioned in each of your responses. 1. Trying the sample code base that Wong mentioned below resulted in a no-response from JobManager. I had to use the non-sql connector jar in my python

Cannot upgrade helm chart

2022-02-21 Thread marco andreas
Hello flink community, I am deploying a flink application cluster using a helm chart , the problem is that the jobmanager component type is a "Job" , and with helm i can't do an upgrade of the chart in order to change the application image version because helm is unable to upgrade the docker

Re: Apache Flink - Continuously streaming data using jdbc connector

2022-02-21 Thread M Singh
Thanks Guovei and Francis for your references.    On Monday, February 21, 2022, 01:05:58 AM EST, Guowei Ma wrote: Hi, You can try flink's cdc connector [1] to see if it meets your needs. [1] https://github.com/ververica/flink-cdc-connectors Best,Guowei On Mon, Feb 21, 2022 at 6:23

[Table API] [JDBC CDC] Caching Configuration from MySql instance needed in multiple Flink Jobs

2022-02-21 Thread Dan Serb
Hello all, I kind of need the community’s help with some ideas, as I’m quite new with Flink and I feel like I need a little bit of guidance in regard to an implementation I’m working on. What I need to do, is to have a way to store a mysql table in Flink, and expose that data to other jobs,

Re: Pulsar connector 2.9.1 failing job submission in standalone.

2022-02-21 Thread Luning Wong
Luning Wong 于2022年2月21日周一 19:38写道: > import logging > import sys > > from pyflink.common import SimpleStringSchema, WatermarkStrategy > from pyflink.datastream import StreamExecutionEnvironment > from pyflink.datastream.connectors import PulsarSource, > PulsarDeserializationSchema,

CSV join in batch mode

2022-02-21 Thread Killian GUIHEUX
Hello all, I have to perform a join between two large csv sets that do not fit in ram. I process this two files in batch mode. I also need a side output to catch csv processing errors. So my question is what is the best way to this kind of join operation ? I think I should use a valueState

Re: Pulsar connector 2.9.1 failing job submission in standalone.

2022-02-21 Thread Yufei Zhang
Hi Ananth, >From the steps you described, the steps involved using `flink-sql-connector-pulsar-1.15-SNAPSHOT.jar`, however to my knowledge pulsar connector has not supported Table API yet, so would you mind considering using the `flink-connector-pulsar-1.14.jar` (without sql, though the classes

Re: Pulsar connector 2.9.1 failing job submission in standalone.

2022-02-21 Thread Ananth Gundabattula
Thanks Guowei. A small correction in the telnet result command below. I had a typo in the telnet command earlier (did not separate the port from host name ). Issuing the proper telnet command resolved the jobmanagers host properly. Regards, Ananth From: Guowei Ma Date: Monday, 21 February

????

2022-02-21 Thread Allen

Re: Pulsar connector 2.9.1 failing job submission in standalone.

2022-02-21 Thread Guowei Ma
Thanks Ananth for your clarification.But I am not an expert on Pulsar. I would cc the author of the connector to have a look. Would Yufei like to give some insight? Best, Guowei On Mon, Feb 21, 2022 at 2:10 PM Ananth Gundabattula < agundabatt...@darwinium.com> wrote: > Thanks for the response

????

2022-02-21 Thread Blake

退订

2022-02-21 Thread 王翔
退订

Re: DataStream API: Parquet File Format with Scala Case Classes

2022-02-21 Thread Fabian Paul
Hi Ryan, Thanks for bringing up this topic. Currently, your analysis is correct, and reading parquet files outside the Table API is rather difficult. The community started an effort in Flink 1.15 to restructure some of the formats to make them better applicable to the DataStream and Table API.