Re: Re: flink1.9 Blink sql 丢失主键+去重和时态表联合使用吞吐量低

2020-05-10 Thread 刘大龙
Hi, 我觉得你可以多试几种方式,比如先关联再去重,测试一下性能呢?另外,去重的话,你这个业务逻辑是用es字段排序取最新,不能用procetime去重取最新吗?你的sql中rowNum<=1实际上生成的应该是Rank算子,不是Deduplication算子吧。我在业务中对订单类型的数据是这样用去重算子的,select * from (select *, row_number() over(partition by id order by proctime desc as rowNum from xxx) tmp where rowNum =

Re: Statefun 2.0 questions

2020-05-10 Thread Tzu-Li (Gordon) Tai
Hi, Correct me if I'm wrong, but from the discussion so far it seems like what Wouter is looking for is an HTTP-based ingress / egress. We have been thinking about this in the past. The specifics of the implementation is still to be discussed, but to be able to ensure exactly-once processing

Re: Rich Function Thread Safety

2020-05-10 Thread Tzu-Li (Gordon) Tai
As others have mentioned already, it is true that method calls on operators (e.g. processing events and snapshotting state) will not concurrently happen. As for your findings in reading through the documentation, that might be a hint that we could add a bit more explanation mentioning this. Could

Re: Cannot start native K8s

2020-05-10 Thread Yang Wang
Glad to hear that you could deploy the Flink cluster on K8s natively. Thanks for trying the in-preview feature and give your feedback. Moreover, i want to give a very simple conclusion here. Currently, because of the compatibility issue of fabric8 kubernetes-client, the native K8s integration

Re: flink1.9 Blink sql 丢失主键+去重和时态表联合使用吞吐量低

2020-05-10 Thread 宇张
hi、 我这面state backend用的是FsStateBackend,状态保存在hdfs On Mon, May 11, 2020 at 11:19 AM Benchao Li wrote: > Hi, > > 你用的是什么state backend呢?看你的情况很有可能跟这个有关系。比如用的是rocksdb,然后是普通磁盘的话,很容易遇到IO瓶颈。 > > 宇张 于2020年5月11日周一 上午11:14写道: > > > hi、 > > 我这面使用flink1.9的Blink sql完成数据转换操作,但遇到如下问题: > > 1、使用row_number函数丢失主键 >

Re: flink1.9 Blink sql 丢失主键+去重和时态表联合使用吞吐量低

2020-05-10 Thread Benchao Li
Hi, 你用的是什么state backend呢?看你的情况很有可能跟这个有关系。比如用的是rocksdb,然后是普通磁盘的话,很容易遇到IO瓶颈。 宇张 于2020年5月11日周一 上午11:14写道: > hi、 > 我这面使用flink1.9的Blink sql完成数据转换操作,但遇到如下问题: > 1、使用row_number函数丢失主键 > 2、row_number函数和时态表关联联合使用程序吞吐量严重降低,对应sql如下: > // 理论上这里面是不需要 distinct的,但sql中的主键blink提取不出来导致校验不通过,所以加了一个 > SELECT

flink1.9 Blink sql 丢失主键+去重和时态表联合使用吞吐量低

2020-05-10 Thread 宇张
hi、 我这面使用flink1.9的Blink sql完成数据转换操作,但遇到如下问题: 1、使用row_number函数丢失主键 2、row_number函数和时态表关联联合使用程序吞吐量严重降低,对应sql如下: // 理论上这里面是不需要 distinct的,但sql中的主键blink提取不出来导致校验不通过,所以加了一个 SELECT distinct t1.id as order_id,...,DATE_FORMAT(t1.proctime,'-MM-dd HH:mm:ss') as etl_time FROM (select id,...,proctime from

sql topN ArrayIndexOutOfBoundsException

2020-05-10 Thread 1101300123
我的flink SQL 的时候使用了topN语法,在程序运行一段时间后出现了如下错误: [flink-akka.actor.default-dispatcher-8695] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Rank(strategy=[AppendFastStrategy], rankType=[ROW_NUMBER], rankRange=[rankStart=1, rankEnd=1], partitionBy=[contact_id, service_no],

How to get the applicationid internally when the Flink is running on yarn?

2020-05-10 Thread 462329521
Hi,How to get the applicationid internally when the Flink is running on yarn? Thanks.

Re:flink1.9,FSstatebackend,checkpoint失败

2020-05-10 Thread guanyq
flink版本:1.9 stateBackEnd:FsStateBackEnd 附件: checkpoint失败截图及日志,应该如何分析解决这种失败?2020-05-10 21:22:09,336 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - 2020-05-10 21:22:09,337 INFO

flink1.9,FSstatebackend,checkpoint失败

2020-05-10 Thread guanyq
flink版本:1.9 stateBackEnd:FsStateBackEnd 附件: checkpoint失败截图及日志,应该如何分析解决这种失败? 2020-05-10 21:22:09,336 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - 2020-05-10 21:22:09,337 INFO

flink1.9,FSstatebackend,checkpoint相关问题

2020-05-10 Thread guanyq
附件为日志,麻烦帮忙分析一下。 2020-05-10 21:22:09,336 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - 2020-05-10 21:22:09,337 INFO org.apache.flink.yarn.YarnTaskExecutorRunner - Starting

[ANNOUNCE] Weekly Community Update 2020/19

2020-05-10 Thread Konstantin Knauf
Dear community, with only a few days left until the planned feature freeze for Flink 1.11 the dev@ mailing list is getting pretty quite while everyone is working on their final features. Hence, this week only a short update. Flink Development == * [releases] Yu has published RC #3

Re: Flink consuming rate increases slowly

2020-05-10 Thread Chen Qin
Hi Eyal, It’s unclear what warmup phase does in your use cases. Usually we see Flink start consume at high rate and drop to a point downstream can handle. Thanks Chen > On May 10, 2020, at 12:25 AM, Eyal Pe'er wrote: > > Hi all, > Lately I've added more resources to my Flink cluster which

Flink consuming rate increases slowly

2020-05-10 Thread Eyal Pe'er
Hi all, Lately I've added more resources to my Flink cluster which required a restart of all apps. >From the cluster side, the only change I made, is to add more task slots. On the cluster I have a streaming app that consumes from Kafka and sinks to files. I noticed that since the restart, the