Re: [DISCUSS] Drop Savepoint Compatibility with Flink 1.2

2020-02-23 Thread Yu Li
+1 for dropping savepoint compatibility with Flink 1.2. Best Regards, Yu On Sat, 22 Feb 2020 at 22:05, Ufuk Celebi wrote: > Hey Stephan, > > +1. > > Reading over the linked ticket and your description here, I think it makes > a lot of sense to go ahead with this. Since it's possible to

Re: FsStateBackend vs RocksDBStateBackend

2020-02-23 Thread Yu Li
Yes, FsStateBackend would be the best fit for state access performance in this case. Just a reminder that FsStateBackend uploads the full dataset to DFS on every checkpoint, so please watch the network bandwidth usage and make sure it doesn't become a new bottleneck. Best Regards, Yu On Fri,
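A rough way to reason about the bandwidth concern above is a back-of-the-envelope estimate. The sketch below is not from the thread: the function name and the example figures (30 GB of state, a 5-minute checkpoint interval) are hypothetical, and it assumes the full state is written on every checkpoint with no compression.

```python
def avg_upload_mb_per_s(state_size_gb, checkpoint_interval_s):
    """Average upload rate (MB/s) if the full state is written once per checkpoint.

    Assumption: every checkpoint uploads the whole state, uncompressed.
    """
    return state_size_gb * 1024 / checkpoint_interval_s


# Hypothetical figures: 30 GB of state, one checkpoint every 300 seconds.
rate = avg_upload_mb_per_s(30, 300)
print(f"sustained upload of about {rate:.1f} MB/s")
```

The result is only an average. Each upload is bursty and has to finish before the checkpoint times out, so the peak rate on the network will be higher than this figure.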

Re: [ANNOUNCE] Jingsong Lee becomes a Flink committer

2020-02-23 Thread Yu Li
Congratulations Jingsong! Well deserved. Best Regards, Yu On Mon, 24 Feb 2020 at 14:10, Congxian Qiu wrote: > Congratulations Jingsong! > > Best, > Congxian > > > On Mon, 24 Feb 2020 at 1:38 PM, jincheng sun wrote: > >> Congratulations Jingsong! >> >> Best, >> Jincheng >> >> >> On Mon, 24 Feb 2020, Zhu Zhu

Re: [ANNOUNCE] Jingsong Lee becomes a Flink committer

2020-02-23 Thread Congxian Qiu
Congratulations Jingsong! Best, Congxian On Mon, 24 Feb 2020 at 1:38 PM, jincheng sun wrote: > Congratulations Jingsong! > > Best, > Jincheng > > > On Mon, 24 Feb 2020 at 11:55 AM, Zhu Zhu wrote: > >> Congratulations Jingsong! >> >> Thanks, >> Zhu Zhu >> >> On Sat, 22 Feb 2020 at 1:30 AM, Fabian Hueske wrote: >> >>> Congrats Jingsong!

Re: [ANNOUNCE] Jingsong Lee becomes a Flink committer

2020-02-23 Thread jincheng sun
Congratulations Jingsong! Best, Jincheng On Mon, 24 Feb 2020 at 11:55 AM, Zhu Zhu wrote: > Congratulations Jingsong! > > Thanks, > Zhu Zhu > > On Sat, 22 Feb 2020 at 1:30 AM, Fabian Hueske wrote: > >> Congrats Jingsong! >> >> Cheers, Fabian >> >> On Fri, 21 Feb 2020 at 17:49, Rong Rong wrote: >> >> >

Re: [ANNOUNCE] Jingsong Lee becomes a Flink committer

2020-02-23 Thread Zhu Zhu
Congratulations Jingsong! Thanks, Zhu Zhu On Sat, 22 Feb 2020 at 1:30 AM, Fabian Hueske wrote: > Congrats Jingsong! > > Cheers, Fabian > > On Fri, 21 Feb 2020 at 17:49, Rong Rong wrote: > > > Congratulations Jingsong!! > > > > Cheers, > > Rong > > > > On Fri, Feb 21, 2020 at 8:45 AM Bowen Li

Re: Flink on Kubernetes - Session vs Job cluster mode and storage

2020-02-23 Thread Yang Wang
Hi Singh, Glad to hear that you are looking to run Flink on Kubernetes. I will try to answer your question based on my limited knowledge, and others can correct me and add more details. I think the biggest difference between session cluster and per-job cluster on Kubernetes is the

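Flink job that reads from Kafka, aggregates, and writes back to Kafka fails

2020-02-23 Thread chanamper
Hi all, I have a question. A Flink job reads data from Kafka, aggregates it, and writes the results back to Kafka; the Flink version is 1.8.0. After running for a while, the job hits the exception below and then crashes. How can this be resolved? Many thanks. 2020-02-19 10:45:45,314 ERROR org.apache.flink.runtime.io.network.netty.PartitionRequestQueue - Encountered error while consuming partitions java.io.IOException: Connection reset by peer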

Re: yarn session: one JVM per task

2020-02-23 Thread Xintong Song
Hi David, In general, I don't think current Flink lets you force all parallel subtasks of a given task to run in the same JVM process. If your job is very small, one thing you might try is to have only one task manager in the Flink session cluster. You need to make sure the task

[ANNOUNCE] Weekly Community Update 2020/07

2020-02-23 Thread Konstantin Knauf
Dear community, happy to share this week's community digest with updates on the next release cycle, a set of proposals for Flink's web user interface, a couple of discussions around our development process and a bit more. Flink Development == * [releases] Stephan proposes an

Re: timestamp issue

2020-02-23 Thread Jark Wu
Hi Fei, the Kafka source/sink does not support the TIMESTAMP(6) type; it supports precision 3. Since TIMESTAMP without an explicit precision currently defaults to 6, you need to change TIMESTAMP in your DDL to TIMESTAMP(3). Best, Jark On Sun, 23 Feb 2020 at 15:44, Fei Han wrote: > > Hi all: > I ran the following DDL and SQL in Zeppelin and got the error below: > DDL: > DROP TABLE IF EXISTS user_log ; > CREATE TABLE user_log ( >

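Re: If some map-stage computation is slow and its checkpointing is also slow, will that block the reduce operator from continuing?

2020-02-23 Thread Jark Wu
Hi Mark, > Will taskA1 continue processing cp2 data? If so, will taskB process the cp2 data that taskA sends to it? A1 will continue processing. In exactly-once mode, taskB will not process the cp2 data that taskA sends to it. So if A2 is extremely slow, taskB will eventually backpressure A1, and A1 will not be able to continue processing either. > The same question if taskA is itself a reduce operation (keyBy) and taskB is a map operation: is the answer the same? The answer is the same. Best, Jark On Sun, 23 Feb 2020 at 19:18,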
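The behaviour described in this reply can be illustrated with a toy simulation of barrier alignment in exactly-once mode. This is a minimal sketch and not Flink code: the function `run_task_b`, the channel names and the record labels are all hypothetical, only a single checkpoint barrier (cp1) is modelled, and the buffer is unbounded here, whereas the reply notes that in practice the held-back data eventually backpressures A1.

```python
BARRIER = "barrier"  # stands in for the cp1 checkpoint barrier


def run_task_b(events, channels):
    """Simulate taskB aligning one checkpoint barrier across its input channels.

    events: (channel, item) pairs in arrival order; item is a record label
    or BARRIER. Returns (log, buffered): what taskB did in order, and the
    records still held back at the end.
    """
    blocked = set()   # channels whose barrier has already arrived
    buffered = []     # post-barrier records held back during alignment
    log = []
    for channel, item in events:
        if item == BARRIER:
            blocked.add(channel)
            if blocked == set(channels):
                # Barrier seen on every input: snapshot, then release the buffer.
                log.append("snapshot")
                log.extend(buffered)
                buffered.clear()
                blocked.clear()
        elif channel in blocked:
            buffered.append(item)  # cp2 data from the fast channel must wait
        else:
            log.append(item)       # cp1 data is processed immediately
    return log, buffered


# A1 is fast and already sends cp2 data; A2 is still working on cp1.
events = [("A1", "a1-cp1"), ("A1", BARRIER), ("A1", "a1-cp2"), ("A2", "a2-cp1")]
print(run_task_b(events, ["A1", "A2"]))
# Once A2's barrier arrives, taskB snapshots and drains the buffer.
print(run_task_b(events + [("A2", BARRIER)], ["A1", "A2"]))
```

In the first call the cp2 record from A1 stays in the buffer because A2's barrier has not arrived yet. In the second call it is processed only after the snapshot.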

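If some map-stage computation is slow and its checkpointing is also slow, will that block the reduce operator from continuing?

2020-02-23 Thread Mark Zang
Consider a simple map and reduce job. A is the map operator and B is the keyBy operator. A has two tasks, taskA1 and taskA2; B has only one, taskB. Suppose taskA2 runs very slowly. After taskA1 finishes checkpoint cp1, it notifies taskB and has started (or can start) processing the data for the next checkpoint, cp2. Meanwhile taskA2 is still slowly processing cp1 data. At this point: will taskA1 continue processing cp2 data? If so, will taskB process the cp2 data that taskA passes to it? Or do taskA1 and taskB both stop processing and wait for taskA2?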

CfP: Workshop on Large Scale RDF Analytics (LASCAR-20) at ESWC'20

2020-02-23 Thread Hajira Jabeen
We apologize for cross-postings. We appreciate your great help in forwarding this CFP to your colleagues and friends.

Batch reading from Cassandra

2020-02-23 Thread Lasse Nedergaard
Hi. We would like to do some batch analytics on our data set stored in Cassandra and are looking for an efficient way to load data from a single table. Not by key, but a random 15%, 50% or 100%. Data bricks has created an efficient way to load Cassandra data into Apache Spark and they are doing