Re: Flink restart strategy on specific exception

2020-05-14 Thread Zhu Zhu
Ticket FLINK-17714 has been created to track this requirement. Thanks, Zhu Zhu. Till Rohrmann wrote on Wed, May 13, 2020 at 8:30 PM: > Yes, you are right Zhu Zhu. Extending the RestartBackoffTimeStrategyFactoryLoader to also load custom RestartBackoffTimeStrategies sounds like a good improvement for the future.
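
Until custom strategies can be plugged in, restart behavior is selected from the built-in strategies in flink-conf.yaml. A minimal sketch (the values are illustrative only):

    restart-strategy: fixed-delay
    restart-strategy.fixed-delay.attempts: 3
    restart-strategy.fixed-delay.delay: 10 s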

Flink suggestions

2020-05-14 Thread Aissa Elaffani
Hello Guys, I am a beginner in the field of real-time streaming. I am working with Apache Flink and am still unfamiliar with many of its features. I am building an application in which I receive sensor data in this format: {"status": "Alerte", "classe": " ", "value": {"temperature":

Re: Flink historical data join

2020-05-14 Thread jimandlice
I want to do this via the API, by extending classes, not by working directly in SQL. On 2020-05-15 11:38, jimandlice wrote: Should this be done with the API or with Table SQL? Which is easier to integrate? Either way the joined data is then written into HDFS. I have just taken over this project and there is a lot I do not understand; any help would be appreciated. Thanks. | jimandlice | Email: jimandl...@163.com

Re: Flink historical data join

2020-05-14 Thread jimandlice
Should this be done with the API or with Table SQL? Which is easier to integrate? Either way the joined data is then written into HDFS. I have just taken over this project and there is a lot I do not understand; any help would be appreciated. Thanks. | jimandlice | Email: jimandl...@163.com | On 2020-05-15 11:34, Benchao Li wrote: This looks like a join across heterogeneous data sources. You can try Flink SQL directly: it can now batch-read HBase and MySQL, and it can write to Hive.

Re: Flink historical data join

2020-05-14 Thread Benchao Li
This looks like a join across heterogeneous data sources. You can try Flink SQL directly: it can now batch-read HBase and MySQL, and it can write to Hive. jimandlice wrote on Fri, May 15, 2020 at 11:16 AM: > At work I have a requirement involving two data sources, one MySQL and one HBase. Both hold a lot of historical data, and nothing is being written to them anymore. I need to join two tables across these two sources, store the generated data on HDFS, and import it into Hive. I don't know whether to use DataStream or DataSet; I don't have a good solution yet.
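
A minimal sketch of that suggestion, assuming Flink 1.10 with the blink planner in batch mode and the flink-jdbc and flink-hbase connectors on the classpath; every connection detail, table, and field below is invented for illustration:

    EnvironmentSettings settings = EnvironmentSettings.newInstance()
        .useBlinkPlanner()
        .inBatchMode()
        .build();
    TableEnvironment tEnv = TableEnvironment.create(settings);

    // MySQL side of the join, read in batch mode via the JDBC connector
    tEnv.sqlUpdate(
        "CREATE TABLE mysql_orders (order_id STRING, amount DOUBLE) WITH (" +
        "  'connector.type' = 'jdbc'," +
        "  'connector.url' = 'jdbc:mysql://host:3306/db'," +
        "  'connector.table' = 'orders'," +
        "  'connector.username' = 'user'," +
        "  'connector.password' = 'pass')");

    // HBase side; columns are grouped into a ROW per column family
    tEnv.sqlUpdate(
        "CREATE TABLE hbase_users (rowkey STRING, cf ROW<user_name STRING>) WITH (" +
        "  'connector.type' = 'hbase'," +
        "  'connector.version' = '1.4.3'," +
        "  'connector.table-name' = 'users'," +
        "  'connector.zookeeper.quorum' = 'zk:2181')");

    // result lands on HDFS; a Hive table over the same path can then expose it
    tEnv.sqlUpdate(
        "CREATE TABLE hdfs_out (order_id STRING, user_name STRING, amount DOUBLE) WITH (" +
        "  'connector.type' = 'filesystem'," +
        "  'connector.path' = 'hdfs:///warehouse/joined'," +
        "  'format.type' = 'csv')");

    tEnv.sqlUpdate(
        "INSERT INTO hdfs_out " +
        "SELECT o.order_id, u.cf.user_name, o.amount " +
        "FROM mysql_orders o JOIN hbase_users u ON o.order_id = u.rowkey");
    tEnv.execute("historical-join");

Writing through the HiveCatalog instead of a plain filesystem table is also possible; the filesystem sink just keeps the sketch short.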

Flink historical data join

2020-05-14 Thread jimandlice
At work I have a requirement involving two data sources, one MySQL and one HBase. Both hold a lot of historical data, and nothing is being written to them anymore. I need to join two tables across these two sources, store the generated data on HDFS, and import it into Hive. I don't know whether to use DataStream or DataSet; I don't have a good solution yet. A reply would be appreciated. | jimandlice | Email: jimandl...@163.com

Flink performance tuning on operators

2020-05-14 Thread Ivan Yang
Hi, We have a Flink job that reads data from an input stream, converts each event from a JSON string to an Avro object, and finally writes to Parquet files using StreamingFileSink with OnCheckpointRollingPolicy (5 min). Basically a stateless job. Initially, we use one map operator to convert Json
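
For reference, a sketch of such a sink, assuming the JSON-to-Avro map stage emits Avro GenericRecords (env, schema, the path, and avroStream are placeholders). With bulk formats like Parquet the sink always rolls on checkpoint, so the 5-minute cadence comes from the checkpoint interval:

    import org.apache.avro.generic.GenericRecord;
    import org.apache.flink.core.fs.Path;
    import org.apache.flink.formats.parquet.avro.ParquetAvroWriters;
    import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

    // roll on every checkpoint, i.e. every 5 minutes
    env.enableCheckpointing(5 * 60 * 1000);

    StreamingFileSink<GenericRecord> sink = StreamingFileSink
        .forBulkFormat(new Path("hdfs:///output"),
            ParquetAvroWriters.forGenericRecord(schema))
        .build();
    avroStream.addSink(sink);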

Re: Savepoint disaster-recovery options

2020-05-14 Thread LakeShen
Hi, could you describe your scenario in more detail? Best, LakeShen. 请叫我雷锋 <854194...@qq.com> wrote on Thu, May 14, 2020 at 9:42 PM: > Hi all, is there a good disaster-recovery solution for savepoints? > > Sent from my iPhone

Re: upgrade flink from 1.9.1 to 1.10.0 on EMR

2020-05-14 Thread aj
Hi Yang, I was able to resolve the issue by removing the Hadoop dependency as you mentioned. 1. Removed the hadoop-common dependency and org.apache.flink flink-streaming-java_${scala.binary.version} ${flink.version} org.apache.flink

How to read UUID out of a JDBC table source

2020-05-14 Thread Bonino Dario
Dear list, I need to use a Table Source to extract data from a PostgreSQL table that includes a column of type uuid. Data in the column is converted to java.util.UUID by the PostgreSQL JDBC driver (I guess); however, I was not able to find a way to define a Table schema for correctly reading
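
One possible workaround (an assumption, not a confirmed answer): cast the uuid column to text on the PostgreSQL side so Flink only ever sees a VARCHAR, e.g. with the 1.9/1.10-era JDBCInputFormat. The connection details below are placeholders:

    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.api.java.io.jdbc.JDBCInputFormat;
    import org.apache.flink.api.java.typeutils.RowTypeInfo;

    JDBCInputFormat input = JDBCInputFormat.buildJDBCInputFormat()
        .setDrivername("org.postgresql.Driver")
        .setDBUrl("jdbc:postgresql://localhost:5432/db")
        // id is the uuid column; ::text makes the driver hand back a String
        .setQuery("SELECT id::text, payload FROM events")
        .setRowTypeInfo(new RowTypeInfo(Types.STRING, Types.STRING))
        .finish();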

Protection against huge values in RocksDB List State

2020-05-14 Thread Robin Cassan
Hi all! I cannot seem to find any setting to limit the number of records appended to a RocksDBListState, which is used when we combine SessionWindows with a ProcessFunction. It seems that, for each incoming element, the new element is appended to the value with the RocksDB `merge` operator,
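
There is no such setting as far as I know either. A common way to sidestep the problem, when the per-window logic allows it, is to pre-aggregate with an AggregateFunction so the window state is one accumulator instead of every buffered element. A sketch, with Event and the counting logic as placeholders:

    import org.apache.flink.api.common.functions.AggregateFunction;
    import org.apache.flink.streaming.api.windowing.assigners.EventTimeSessionWindows;
    import org.apache.flink.streaming.api.windowing.time.Time;

    DataStream<Long> counts = events
        .keyBy(e -> e.getKey())
        .window(EventTimeSessionWindows.withGap(Time.minutes(10)))
        .aggregate(new AggregateFunction<Event, Long, Long>() {
            @Override public Long createAccumulator() { return 0L; }
            @Override public Long add(Event value, Long acc) { return acc + 1; }
            @Override public Long getResult(Long acc) { return acc; }
            @Override public Long merge(Long a, Long b) { return a + b; }
        });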

Re: Watermarks and parallelism

2020-05-14 Thread Alexander Fedulov
Hi Gnana, 1. No, watermarks are generated independently per subtask. I think this section of the docs might make things clearer [1]. 2. The same watermark from
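
For context, a typical per-subtask watermark generator in the 1.x DataStream API (Event and its timestamp getter are placeholders). Each parallel instance emits its own watermarks, and downstream operators take the minimum across their input channels:

    import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
    import org.apache.flink.streaming.api.windowing.time.Time;

    DataStream<Event> withTimestamps = events.assignTimestampsAndWatermarks(
        new BoundedOutOfOrdernessTimestampExtractor<Event>(Time.seconds(5)) {
            @Override
            public long extractTimestamp(Event e) {
                return e.getTimestampMillis();
            }
        });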

Savepoint disaster-recovery options

2020-05-14 Thread 请叫我雷锋
Hi all, is there a good disaster-recovery solution for savepoints? Sent from my iPhone

[Announce] Flink Forward Global 2020 - Call for Proposals

2020-05-14 Thread Seth Wiesman
Hi Everyone! After a successful Virtual Flink Forward in April, we have decided to present our October edition in the same way. In these uncertain times, we are conscious of everyone's health and safety and want to make sure our events are accessible to everyone. Flink Forward Global

Re: ProducerRecord with Kafka Sink for 1.8.0

2020-05-14 Thread Nick Bendtner
Hi Gary, I have used this technique before: I deleted the flink-avro jar from lib, packed it into the application jar, and there were no problems. Best, Nick. On Thu, May 14, 2020 at 6:11 AM Gary Yao wrote: > It's because the flink distribution of the cluster is 1.7.2. We use a >> standalone

Re: Statefun 2.0 questions

2020-05-14 Thread Igal Shilman
Hi, I'm glad things are getting clearer; looking forward to seeing how statefun works out for you :-) To change the parallelism you can simply set the "parallelism.default" [1] key in flink-conf.yaml. It is located in the statefun container at /opt/flink/conf/flink-conf.yaml. To avoid
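
For illustration, the change described above (the value 4 is just an example):

    # /opt/flink/conf/flink-conf.yaml inside the statefun container
    parallelism.default: 4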

Flink Weekly | Weekly community update - 2020/05/14

2020-05-14 Thread forideal
Hi everyone, this is the fifteenth issue of Flink Weekly, compiled by Zhang Cheng and reviewed by Li Benchao. It covers recent community development progress, mailing-list Q&A, and the latest Flink community news and recommended technical articles. Community development progress: Release [releases] Flink 1.10.1 has been officially released. http://apache-flink.147419.n8.nabble.com/ANNOUNCE-Apache-Flink-1-10-1-released-td3054.html [releases] Tzu-Li started the Flink Stateful Functions Release 2.0.0

Re: ProducerRecord with Kafka Sink for 1.8.0

2020-05-14 Thread Gary Yao
> It's because the flink distribution of the cluster is 1.7.2. We use a standalone cluster, so in the lib directory of Flink the artifact is flink-core-1.7.2.jar. I need to pack flink-core-1.9.0.jar from the application and use child-first class loading to use the newer version of flink-core.
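
The relevant flink-conf.yaml switch is sketched below; note that child-first is already the default, and that classes matching the parent-first patterns (which by default include org.apache.flink.) are still loaded from lib, so overriding flink-core from the user jar may also require adjusting those patterns. Illustrative only:

    classloader.resolve-order: child-first
    # the parent-first pattern list may need narrowing for org.apache.flink
    # classes; see classloader.parent-first-patterns.default / .additional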

Re: How to subscribe to a Kinesis Stream using enhanced fan-out?

2020-05-14 Thread Tzu-Li (Gordon) Tai
Hi Xiaolong, You are right: the way the Kinesis connector is implemented / the way the AWS APIs are used does not allow it to consume Kinesis streams with enhanced fan-out enabled consumers [1]. Could you open a JIRA ticket for this? As far as I can tell, this could be a valuable contribution to

Re: Flink operator throttle

2020-05-14 Thread Benchao Li
AFAIK, `FlinkKafkaConsumer010#setRateLimiter` can configure the Kafka source to have a rate limiter. (I assume you use Kafka.) However, it only exists in the Kafka 0.10 DataStream connector, not in other versions or in the Table API. 王雷 wrote on Thu, May 14, 2020 at 5:31 PM: > hi, All > > Does Flink support rate
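
A sketch of that API, assuming the Kafka 0.10 DataStream connector; the topic, properties, and the 1 MiB/s figure are placeholders:

    import org.apache.flink.api.common.io.ratelimiting.GuavaFlinkConnectorRateLimiter;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;

    FlinkKafkaConsumer010<String> consumer =
        new FlinkKafkaConsumer010<>("topic", new SimpleStringSchema(), properties);
    GuavaFlinkConnectorRateLimiter rateLimiter = new GuavaFlinkConnectorRateLimiter();
    rateLimiter.setRate(1024 * 1024); // bytes per second, divided among subtasks
    consumer.setRateLimiter(rateLimiter);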

Flink operator throttle

2020-05-14 Thread 王雷
Hi all, does Flink support rate limiting? How can we limit the rate when the external database connected to the sink operator has a throughput limitation? Instead of relying on passive backpressure after reaching the limit of the external database, we want to limit the rate actively. Thanks, Ray

SQL Client question

2020-05-14 Thread AlfredFeng
Hi All, a question about a SQL Client YAML definition (append mode, Flink 1.10) with a Kafka source whose fields are field_0, field_1, field_2...field_n; the rest of the message was lost to encoding corruption.

Re: Flink 1.10 ES sink: configuring object field output

2020-05-14 Thread Yangze Guo
The ES sink uses the JSON format; you can refer to [1]. [1] https://ci.apache.org/projects/flink/flink-docs-master/dev/table/connect.html#json-format Best, Yangze Guo. On Thu, May 14, 2020 at 3:37 PM XiaChang <13628620...@163.com> wrote: > Hi everyone, when an ES index contains an object field, how should the DDL for the Flink 1.10 ES sink be configured?
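
A hedged DDL sketch of what that can look like in 1.10 (all names are invented): declare the object field as a ROW type, and the JSON format writes it as a nested object:

    CREATE TABLE es_sink (
      device_id VARCHAR,
      reading ROW<temperature DOUBLE, humidity DOUBLE>
    ) WITH (
      'connector.type' = 'elasticsearch',
      'connector.version' = '6',
      'connector.hosts' = 'http://localhost:9200',
      'connector.index' = 'readings',
      'connector.document-type' = '_doc',
      'update-mode' = 'append',
      'format.type' = 'json'
    );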

Re: SQL DDL: how to use a Long timestamp as event time

2020-05-14 Thread Leonard Xu
Hi, this is not supported yet; you need to write a simple UDF to do the conversion. The community has an issue [1] tracking this. Best, Leonard Xu [1] https://issues.apache.org/jira/browse/FLINK-16889 > On May 14, 2020 at 10:01, zzh...@foxmail.com wrote: > > Hi all, the message time in our Kafka messages is a Long; there are both second values and millisecond values. > My question is: >

Flink 1.10 ES sink: configuring object field output

2020-05-14 Thread XiaChang
Hi everyone, when an ES index contains an object field, how should the DDL for the Flink 1.10 ES sink be configured?

Flink SQL parsing question

2020-05-14 Thread Senior.Hu
Hi All, when parsing a Flink DML SQL statement containing a temporal table join (join with a temporal table) via FlinkSqlParserImpl.FACTORY, a clause written as LEFT JOIN side_room FOR SYSTEM_TIME AS OF a1.proctime as a2 ON a1.rowkey_room = a2.rowkey comes back as LEFT JOIN LATERAL `side_room` FOR SYSTEM_TIME

Re: Changing the output file names in StreamingFileSink from part-00 to something else

2020-05-14 Thread Sivaprasanna
Hi, just sharing my thoughts. Based on what you have described so far, I think your objective is to have some unique way to identify/filter the output based on the organization. If that's the case, you can implement a BucketAssigner with the logic to create a bucket key based on the
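
A sketch of that idea, assuming an event type with an organization accessor (MyEvent and getOrganization() are placeholders):

    import org.apache.flink.core.io.SimpleVersionedSerializer;
    import org.apache.flink.streaming.api.functions.sink.filesystem.BucketAssigner;
    import org.apache.flink.streaming.api.functions.sink.filesystem.bucketassigners.SimpleVersionedStringSerializer;

    public class OrganizationBucketAssigner implements BucketAssigner<MyEvent, String> {
        @Override
        public String getBucketId(MyEvent element, Context context) {
            // files land under <base-path>/<organization>/part-x-y
            return element.getOrganization();
        }

        @Override
        public SimpleVersionedSerializer<String> getSerializer() {
            return SimpleVersionedStringSerializer.INSTANCE;
        }
    }

It would be wired in via StreamingFileSink.forRowFormat(...).withBucketAssigner(new OrganizationBucketAssigner()).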

Re: Changing the output file names in StreamingFileSink from part-00 to something else

2020-05-14 Thread Jingsong Li
Hi, Dhurandar, Can you describe your needs? Why do you need to modify file names flexibly? What kind of name do you want? Best, Jingsong Lee. On Thu, May 14, 2020 at 2:05 AM dhurandar S wrote: > Yes, we looked at it. > The problem is that the file name gets generated in a dynamic fashion, based on

Re: SQL DDL: how to use a Long timestamp as event time

2020-05-14 Thread fanchuanpo-163
> On May 14, 2020 at 11:12 AM, zzh...@foxmail.com wrote:
>
> Many thanks, Benchao Li. The UDF approach passed my test; the SQL example is as follows:
>
> CREATE TABLE session_login (
>   ,deal_time BIGINT
>   ,deal_time_obj as DY_FROM_UNIXTIME(deal_time*1000)
>   ,WATERMARK FOR deal_time_obj AS deal_time_obj - INTERVAL '60' SECOND
> )WITH(
> ..
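
The thread does not show the UDF itself; a guess at what a UDF like DY_FROM_UNIXTIME could look like in 1.10 (epoch millis in, a SQL TIMESTAMP out, which the WATERMARK clause can then reference):

    import java.sql.Timestamp;
    import org.apache.flink.table.functions.ScalarFunction;

    public class DyFromUnixtime extends ScalarFunction {
        public Timestamp eval(Long epochMillis) {
            return epochMillis == null ? null : new Timestamp(epochMillis);
        }
    }

    // registered under the name used in the DDL, e.g.
    // tableEnv.registerFunction("DY_FROM_UNIXTIME", new DyFromUnixtime());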

Re: Broadcast state vs data enrichment

2020-05-14 Thread Manas Kale
I see, thank you Roman! On Tue, May 12, 2020 at 4:59 PM Khachatryan Roman <khachatryan.ro...@gmail.com> wrote: > Thanks for the clarification. > > Apparently, the second option (with enricher) creates more load by adding > configuration to every event. Unless events are much bigger than the >