Re: How to handle Flink Job with 400MB+ Uberjar with 800+ containers ?

2019-08-29 Thread SHI Xiaogang
Hi Dadashov, We faced similar problems in our production clusters. Currently both launching and stopping of containers are performed in the main thread of YarnResourceManager. As containers are launched and stopped one after another, it usually takes a long time to bootstrap large jobs. Things get worse

How to handle Flink Job with 400MB+ Uberjar with 800+ containers ?

2019-08-29 Thread Elkhan Dadashov
Dear Flink developers, Having difficulty getting a Flink job started. The job's uberjar/fat jar is around 400MB, and I need to launch 800+ containers. The default HDFS replication is 3. *The Yarn queue is empty, and 800 containers are allocated almost immediately by Yarn RM.* It takes
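A back-of-envelope sketch (my own assumption, not from the thread) of why localization is slow here: with HDFS replication 3, 800 containers each pulling a 400 MB uberjar read roughly 312 GB in total, concentrated on only 3 datanodes. Raising the replication factor of the jar spreads that read load.

```python
# Estimate the localization read load on HDFS (illustrative arithmetic only).
def total_read_gb(jar_mb, containers):
    """Total bytes pulled by all containers, in GB."""
    return jar_mb * containers / 1024

def per_replica_gb(jar_mb, containers, replication):
    """Best-case read load served by each replica-holding datanode, in GB."""
    return total_read_gb(jar_mb, containers) / replication

load = per_replica_gb(400, 800, 3)  # GB each of the 3 replicas serves
```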

Re: Flink and kerberos

2019-08-29 Thread Vishwas Siravara
Thanks, I'll check it out. On Thu, Aug 29, 2019 at 1:08 PM David Morin wrote: > Vishwas, > > A config that works on my Kerberized cluster (Flink on Yarn). > I hope this will help you. > > Flink conf: > security.kerberos.login.use-ticket-cache: true > security.kerberos.login.keytab:

Re: Flink and kerberos

2019-08-29 Thread David Morin
Vishwas, A config that works on my Kerberized cluster (Flink on Yarn). I hope this will help you. Flink conf: security.kerberos.login.use-ticket-cache: true security.kerberos.login.keytab: /home/myuser/myuser.keytab security.kerberos.login.principal: myuser@ security.kerberos.login.contexts:
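The config above is truncated mid-key; a minimal complete sketch of a keytab-based setup looks like the following (paths, principal, and the `Client,KafkaClient` contexts value are placeholders to adapt, not taken from David's cluster):

```yaml
# flink-conf.yaml -- keytab-based Kerberos login (illustrative values)
security.kerberos.login.use-ticket-cache: true
security.kerberos.login.keytab: /home/myuser/myuser.keytab
security.kerberos.login.principal: myuser@EXAMPLE.COM
security.kerberos.login.contexts: Client,KafkaClient
```

`Client` covers ZooKeeper and `KafkaClient` covers the Kafka connectors; adjust to the services your job actually authenticates against.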

Re: Flink 1.9, MapR secure cluster, high availability

2019-08-29 Thread Stephan Ewen
Hi Maxim! The change of the MapR dependency should not have an impact on that. Do you know if the same thing worked in prior Flink versions? Is that a regression in 1.9? The exception that you report, is that from Flink's HA services trying to connect to ZK, or from the MapR FS client trying to

Re: End of Window Marker

2019-08-29 Thread Eduardo Winpenny Tejedor
Hi, I'll chip in with an approach I'm trying at the moment that seems to work, and I say "seems" because I'm only running this on a personal project. Personally, I don't have anything against end-of-message markers per partition; Padarn, you seem not to prefer this option, as it overloads the

Re: Flink and kerberos

2019-08-29 Thread Vishwas Siravara
Hey David, My consumers are registered, here is the debug log. The problem is the broker does not belong to me, so I can't see what is going on there. But this is a new consumer group, so there is no state yet. org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase - Consumer

Re: Kafka consumer behavior with same group Id, 2 instances of apps in separate cluster?

2019-08-29 Thread Becket Qin
Hi Ashish, You are right. Flink does not use Kafka based group management. So if you have two clusters consuming the same topic, they will not divide the partitions. The cross cluster HA is not quite possible at this point. It would be good to know the reason you want to have such HA and see if

Re: Flink and kerberos

2019-08-29 Thread David Morin
Hello Vishwas, You can use a keytab if you prefer. You generate a keytab for your user and then reference it in the Flink configuration. This keytab will then be handled by Flink in a secure way, and the TGT will be created based on it. However, that seems to be working. Did you check

Re: problem with avro serialization

2019-08-29 Thread Debasish Ghosh
Any update on this ? regards. On Tue, May 14, 2019 at 2:22 PM Tzu-Li (Gordon) Tai wrote: > Hi, > > Aljoscha opened a JIRA just recently for this issue: > https://issues.apache.org/jira/browse/FLINK-12501. > > Do you know if this is a regression from previous Flink versions? > I'm asking just

flink to HDFS Couldn't create proxy provider class ***.ha.ConfiguredFailoverProxyProvider

2019-08-29 Thread 马晓稳
Dear All, [I would like to ask about an issue with Flink writing consumed data into HDFS] 1. We have multiple HDFS clusters. 2. My approach is to put hdfs-site.xml and core-site.xml under the resources directory. [Problem] java.io.IOException: Couldn't create proxy provider class org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider at

Re: Kafka consumer behavior with same group Id, 2 instances of apps in separate cluster?

2019-08-29 Thread ashish pok
Looks like Flink is using "assign" partitions instead of "subscribe", which will not allow participating in a group, if I read the code correctly. Has anyone solved this type of problem in the past for active-active HA across 2 clusters using Kafka? - Ashish On Wednesday, August 28, 2019, 6:52 PM,
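A sketch (not Flink's actual code) of why two clusters do not divide the partitions: Flink statically assigns partitions round-robin across its own subtasks instead of calling `subscribe()`, so each independent cluster claims every partition for itself.

```python
def assign_partitions(num_partitions, num_subtasks, subtask_index):
    """Round-robin static assignment: the partitions one subtask reads."""
    return [p for p in range(num_partitions) if p % num_subtasks == subtask_index]

# Two separate Flink clusters, each with 2 subtasks, over 4 partitions:
cluster_a = [assign_partitions(4, 2, i) for i in range(2)]
cluster_b = [assign_partitions(4, 2, i) for i in range(2)]
# Both clusters cover all 4 partitions, so every record is read twice --
# unlike Kafka's group coordinator, which would split partitions between them.
```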

Left Anti-Join

2019-08-29 Thread ddwcg
hi, I want to calculate the amount of consumption relative to users added in the previous year, but the result of the following SQL calculation is incorrect. The "appendTable" is a table registered from an append stream. select a.years,a.shopId,a.userId, a.amount from (select

Re: Loading dylibs

2019-08-29 Thread Yang Wang
Hi Vishwas, I think it is just because the dylib is loaded more than once in a JVM process (TaskManager). Multiple tasks are deployed in one TaskManager and run in different threads. So if you want the dylib to be loaded only once, maybe you should use the parent classloader. You could use the

Re: About transactionId generation when Flink writes to Kafka

2019-08-29 Thread Wesley Peng
Hi, on 2019/8/29 17:50, ddwcg wrote: There is only one broker. The Flink cluster's clock is indeed different from the broker's, as they are in different data centers. Can I specify the transactionalId myself? Adjusting the clocks of the two data centers to match might affect other applications. AFAIK the transID is generated by systems. regards.

Re: About transactionId generation when Flink writes to Kafka

2019-08-29 Thread ddwcg
There is only one broker. The Flink cluster's clock is indeed different from the broker's, as they are in different data centers. Can I specify the transactionalId myself? Adjusting the clocks of the two data centers to match might affect other applications. > On August 29, 2019, at 17:45, Wesley Peng wrote: > > Hi > > on 2019/8/29 17:13, ddwcg wrote: >> I have already stopped the job, but the Kafka logs keep printing "Initialized transactionalId……", and when the program is started again it reports: >> Caused by:

Re: About transactionId generation when Flink writes to Kafka

2019-08-29 Thread Wesley Peng
Hi, on 2019/8/29 17:13, ddwcg wrote: I have already stopped the job, but the Kafka logs keep printing "Initialized transactionalId……", and when the program is started again it reports: Caused by: org.apache.kafka.common.errors.ProducerFencedException: Producer attempted an operation with an old epoch. Either there is a newer producer with the same transactionalId,

About transactionId generation when Flink writes to Kafka

2019-08-29 Thread ddwcg
hi, When writing to Kafka, a transactionalId is generated automatically. How is this id generated? Specifying it myself does not seem to take effect. I have already stopped the job, but the Kafka logs keep printing "Initialized transactionalId……", and when the program is started again it reports: Caused by: org.apache.kafka.common.errors.ProducerFencedException: Producer attempted an operation with an old epoch. Either there is a newer producer with
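A hypothetical sketch of deterministic transactional-id generation, in the spirit of Flink's Kafka producer (the function and prefix names here are illustrative, not the actual implementation): ids are derived from a stable prefix plus the subtask index and a counter, so a restarted job reuses the same ids and deliberately fences the previous producer epoch, which is what the ProducerFencedException signals.

```python
def transactional_id(prefix, subtask_index, counter):
    """Derive a stable transactional id from job/operator identity (sketch)."""
    return f"{prefix}-{subtask_index}-{counter}"

# Two runs of the "same" job derive identical ids, so run 2 fences run 1:
ids_run1 = [transactional_id("my-job-sink", i, 0) for i in range(3)]
ids_run2 = [transactional_id("my-job-sink", i, 0) for i in range(3)]
```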

Issue with one Flink job consuming messages from multiple Kafka topics

2019-08-29 Thread 史 正超
1. In daily work, a single statistics table often contains several different metrics, e.g. post_count, send_count. 2. However, these metrics come from different Kafka message bodies. 3. Is there a way, without using UNION ALL, to write each topic's data into the sink table? With UNION ALL there are many zero values used as padding.
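One assumed approach (a sketch, not a specific library API): consume the topics with a single source and fold each topic's value into its own column of one output row keyed by user, instead of UNION ALL rows padded with zeros. The topic and metric names below are illustrative.

```python
def merge_metrics(records):
    """records: iterable of (topic, key, value); returns {key: {metric: value}}."""
    topic_to_metric = {"post_topic": "post_count", "send_topic": "send_count"}
    out = {}
    for topic, key, value in records:
        metric = topic_to_metric[topic]
        row = out.setdefault(key, {})
        row[metric] = row.get(metric, 0) + value  # aggregate per key/metric
    return out

rows = merge_metrics([
    ("post_topic", "user1", 3),
    ("send_topic", "user1", 5),
    ("post_topic", "user2", 1),
])
```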

Re: Flink and kerberos

2019-08-29 Thread Vishwas Siravara
I see this log as well , but I can't see any messages . I know for a fact that the topic I am subscribed to has messages as I checked with a simple java consumer with a different group. org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase - Consumer subtask 0 will start reading

Re: Kafka producer failed with InvalidTxnStateException when performing commit transaction

2019-08-29 Thread Tony Wei
Hi, Has anyone run into the same problem? I have updated my producer transaction timeout to 1.5 hours, but the problem still happened when I restarted the broker with the active controller. It might not be due to the checkpoint duration being too long and causing a transaction timeout. I had no more clue
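A rule-of-thumb check (my assumption, not an official formula): the producer's transaction.timeout.ms must comfortably exceed the checkpoint interval plus the worst-case checkpoint duration, or brokers abort the transaction before Flink commits it on checkpoint completion.

```python
def min_transaction_timeout_ms(checkpoint_interval_ms, max_checkpoint_duration_ms,
                               safety_factor=2.0):
    """Lower bound for transaction.timeout.ms under a simple safety margin."""
    return int((checkpoint_interval_ms + max_checkpoint_duration_ms) * safety_factor)

# e.g. 5 min checkpoint interval + up to 10 min checkpoint duration
needed = min_transaction_timeout_ms(5 * 60_000, 10 * 60_000)
```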

Question about periodically cleaning up state while the checkpoint size keeps growing

2019-08-29 Thread 陈赋赟
Hello everyone, I have some questions about using a Flink processFunction to manage state. In my stream I use flatMap to assign multiple endTimes to the data, then key the stream by key+endTime. In processFunction open() I initialize a value state, in processElement() I register a Timer to clean up the state, and in onTimer() I clear the state. My understanding is that if state is cleaned up regularly like this, the checkpoint size should stay at a stable level. But the stream has now been running for 8 days, and I see the checkpoint
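The timer-based cleanup described above can be sketched as follows (a simplified model, not the ProcessFunction API): if every key's state is dropped once its endTime timer fires, the number of live keys stays bounded by the window span even over days of running. A steadily growing checkpoint then points at timers not firing, state not actually being cleared, or old checkpoints being retained.

```python
def live_keys(events, watermark):
    """events: (key, end_time) pairs; a key's state is cleared once the
    watermark passes its end_time (the onTimer() cleanup in the question)."""
    state = {}
    for key, end_time in events:
        state[key] = end_time            # processElement(): write value state
    return {k: t for k, t in state.items() if t > watermark}  # onTimer(): clear

events = [("a", 10), ("b", 20), ("c", 30)]
```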

Flink and kerberos

2019-08-29 Thread Vishwas Siravara
Hi guys, I am using kerberos for my kafka source. I pass the jaas config and krb5.conf in the env.java.opts: -Dconfig.resource=qa.conf -Djava.library.path=/usr/mware/SimpleAPI/voltage-simple-api-java-05.12.-Linux-x86_64-64b-r234867/lib/

Re: Discussion on resetting Kafka offsets when restoring a job from a checkpoint without clearing Flink state

2019-08-29 Thread 蒋涛涛
Hi 唐云, I can try this approach (I am currently using Flink 1.6.2). PS: with the state processor API in Flink 1.9, it should be possible to directly modify the data in a savepoint, including the Kafka offsets. Best regards, 蒋涛涛. Yun Tang wrote on Thursday, August 29, 2019 at 12:12 PM: > Hi 蒋涛涛 > >

Re: Relationship between global parallelism and operator parallelism

2019-08-29 Thread ddwcg
Thank you for your reply. Then if only one slot is available at startup and the operator parallelism is set to 2, will it end up executing with parallelism 1? > On August 29, 2019, at 10:54, pengcheng...@bonc.com.cn wrote: > > Hello, as I understand it, the precedence of parallelism settings is setParallelism > command line > config file. > If an operator has multiple parallel instances, each instance occupies one slot. > Flink SQL cannot set the parallelism. > > > > pengcheng...@bonc.com.cn > > From: ddwcg > Sent: 2019-08-29 10:18 > To: user-zh > Subject:
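The slot question above can be sketched with a simplified model of slot sharing (my simplification, not Flink's scheduler code): with the default slot-sharing group, a job needs as many slots as its maximum operator parallelism. One available slot therefore cannot satisfy an operator with parallelism 2; rather than silently running at parallelism 1, the job waits for (or fails to acquire) the missing resources.

```python
def required_slots(operator_parallelisms):
    """Slots needed under default slot sharing: the max operator parallelism."""
    return max(operator_parallelisms)

demand = required_slots([1, 2, 2])  # e.g. source p=1, map p=2, sink p=2
available = 1
satisfiable = available >= demand   # False: the job cannot be scheduled
```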