Hi Datashov,
We faced similar problems in our production clusters.
Currently, both launching and stopping of containers are performed in the main
thread of YarnResourceManager. As containers are launched and stopped one
after another, it usually takes a long time to bootstrap large jobs. Things
get worse
Dear Flink developers,
I'm having difficulty getting a Flink job started.
The job's uberjar/fat jar is around 400 MB, and I need to launch 800+
containers.
The default HDFS replication is 3.
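For a rough sense of scale: with replication 3, the 400 MB jar is served from
only 3 datanodes, so localizing it into 800 containers means on the order of
800 / 3 ≈ 267 downloads, i.e. roughly 267 × 400 MB ≈ 107 GB of traffic, per
replica node before the job can even start.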
*The Yarn queue is empty, and 800 containers are allocated
almost immediately by Yarn RM.*
It takes
Thanks, I'll check it out.
On Thu, Aug 29, 2019 at 1:08 PM David Morin
wrote:
> Vishwas,
>
> A config that works on my Kerberized cluster (Flink on Yarn).
> I hope this will help you.
>
> Flink conf:
> security.kerberos.login.use-ticket-cache: true
> security.kerberos.login.keytab:
Vishwas,
A config that works on my Kerberized cluster (Flink on Yarn).
I hope this will help you.
Flink conf:
security.kerberos.login.use-ticket-cache: true
security.kerberos.login.keytab: /home/myuser/myuser.keytab
security.kerberos.login.principal: myuser@
security.kerberos.login.contexts:
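For reference, a filled-in version of that block would look like the sketch
below (keytab path, principal, and realm are illustrative; Client covers
ZooKeeper and KafkaClient covers the Kafka connector):

security.kerberos.login.use-ticket-cache: true
security.kerberos.login.keytab: /home/myuser/myuser.keytab
security.kerberos.login.principal: myuser@EXAMPLE.COM
security.kerberos.login.contexts: Client,KafkaClient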
Hi Maxim!
The change of the MapR dependency should not have an impact on that.
Do you know if the same thing worked in prior Flink versions? Is that a
regression in 1.9?
Regarding the exception you report: is that from Flink's HA services trying to
connect to ZK, or from the MapR FS client trying to
Hi,
I'll chip in with an approach I'm trying at the moment that seems to work,
and I say "seems" because I'm only running this on a personal project.
Personally, I don't have anything against end-of-message markers per
partition; Padarn, you seem not to prefer this option, as it overloads the
Hey David,
My consumers are registered; here is the debug log. The problem is that the
broker does not belong to me, so I can't see what is going on there. But
this is a new consumer group, so there is no state yet.
org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase -
Consumer
Hi Ashish,
You are right. Flink does not use Kafka-based group management. So if you
have two clusters consuming the same topic, they will not divide the
partitions. Cross-cluster HA is not quite possible at this point. It
would be good to know the reason you want to have such HA and see if
Hello Vishwas,
You can use a keytab if you prefer. You generate a keytab for your user and
then reference it in the Flink configuration.
The keytab will then be handled by Flink in a secure way, and a TGT will be
created based on it.
However, that part seems to be working.
Did you check
Any update on this?
Regards.
On Tue, May 14, 2019 at 2:22 PM Tzu-Li (Gordon) Tai
wrote:
> Hi,
>
> Aljoscha opened a JIRA just recently for this issue:
> https://issues.apache.org/jira/browse/FLINK-12501.
>
> Do you know if this is a regression from previous Flink versions?
> I'm asking just
Dear All,
I'd like to ask about a problem with Flink consuming data and writing it into HDFS.
1. We have multiple HDFS clusters.
2. My approach is to put hdfs-site.xml and core-site.xml under the resources directory.
The problem:
java.io.IOException: Couldn't create proxy provider class
org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
at
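For context: ConfiguredFailoverProxyProvider is resolved from the HA client
settings in hdfs-site.xml, so this error usually means those entries are
inconsistent across the clusters or the hadoop-hdfs classes are missing from
the classpath. A minimal sketch of the relevant settings, with an illustrative
nameservice "nsA" and hosts:

<property>
  <name>dfs.nameservices</name>
  <value>nsA</value>
</property>
<property>
  <name>dfs.ha.namenodes.nsA</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.nsA.nn1</name>
  <value>namenode1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.nsA.nn2</name>
  <value>namenode2.example.com:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.nsA</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>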
Looks like Flink is using "assign" partitions instead of "subscribe", which will
not allow participating in a group, if I read the code correctly.
Has anyone solved this type of problem in the past for active-active HA across 2
clusters using Kafka?
- Ashish
On Wednesday, August 28, 2019, 6:52 PM,
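For anyone following along, the distinction in the plain Kafka client API looks
like the sketch below (topic, group, and broker names are illustrative). With
subscribe() the broker-side coordinator balances partitions across the group;
with assign() the client pins partitions itself and never participates in
rebalancing, which is what Flink does, tracking offsets in its own checkpoints.

import java.time.Duration;
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class AssignVsSubscribe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker:9092");
        props.setProperty("group.id", "my-group");
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        // subscribe(): join the consumer group; the coordinator assigns and
        // rebalances partitions as members come and go.
        try (KafkaConsumer<String, String> subscriber = new KafkaConsumer<>(props)) {
            subscriber.subscribe(Arrays.asList("my-topic"));
            subscriber.poll(Duration.ofMillis(100));
        }

        // assign(): pick partitions manually; no group membership, no
        // rebalancing, offsets managed by the application itself.
        try (KafkaConsumer<String, String> assigner = new KafkaConsumer<>(props)) {
            assigner.assign(Arrays.asList(new TopicPartition("my-topic", 0)));
            assigner.poll(Duration.ofMillis(100));
        }
    }
}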
hi,
I want to calculate the consumption amount relative to the users added in the
previous year, but the result of the following SQL calculation is incorrect.
The "appendTable" is a table registered from an append stream.
select a.years,a.shopId,a.userId, a.amount
from (select
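Hard to say more without the complete query, but a year-over-year comparison of
this shape is usually expressed as a self-join; a purely hypothetical sketch,
assuming only the columns visible above:

SELECT a.years, a.shopId, a.userId, a.amount
FROM appendTable a
JOIN appendTable b
  ON a.shopId = b.shopId
 AND a.userId = b.userId
 AND a.years = b.years + 1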
Hi Vishwas,
I think it is just because the dylib is loaded more than once in a JVM
process (TaskManager).
Multiple tasks are deployed in one TaskManager and run in different
threads.
So if you want the dylib to be loaded only once, maybe you can use the parent
classloader.
You could use the
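One way to make that concrete (a minimal sketch; the class and library name are
illustrative): put a small loader class in a jar under Flink's lib/ directory
so the parent classloader, which exists once per TaskManager JVM, performs the
load, and have user code call it instead of calling System.loadLibrary directly.

// Lives in flink/lib, NOT in the user jar, so the parent classloader owns it.
public final class NativeLibLoader {
    private static boolean loaded = false;

    private NativeLibLoader() {}

    // The JVM allows a native library to be bound to only one classloader,
    // so this guard must run in a class loaded exactly once per JVM.
    public static synchronized void ensureLoaded() {
        if (!loaded) {
            System.loadLibrary("mylib"); // resolved via java.library.path
            loaded = true;
        }
    }
}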
Hi
on 2019/8/29 17:50, ddwcg wrote:
There is only one broker. The Flink cluster's clock is indeed different from the broker's, since they are in different data centers. Can I specify the transactionalId myself? Adjusting both data centers to the same clock might affect other applications.
AFAIK the transactional ID is generated by the system.
regards.
There is only one broker. The Flink cluster's clock is indeed different from the broker's, since they are in different data centers. Can I specify the transactionalId myself? Adjusting both data centers to the same clock might affect other applications.
> On Aug 29, 2019, at 17:45, Wesley Peng wrote:
>
> Hi
>
> on 2019/8/29 17:13, ddwcg wrote:
>> I have already stopped the job, but the Kafka log keeps printing "Initialized transactionalId ...", and when the program is started again it reports:
>> Caused by:
Hi
on 2019/8/29 17:13, ddwcg wrote:
I have already stopped the job, but the Kafka log keeps printing "Initialized transactionalId ...", and when the program is started again it reports:
Caused by: org.apache.kafka.common.errors.ProducerFencedException: Producer
attempted an operation with an old epoch. Either there is a newer producer with
the same transactionalId,
hi,
A transactionalId is generated automatically when writing to Kafka. How is this ID generated? Specifying it myself does not seem to take effect.
I have already stopped the job, but the Kafka log keeps printing "Initialized transactionalId ...", and when the program is started again it reports:
Caused by: org.apache.kafka.common.errors.ProducerFencedException: Producer
attempted an operation with an old epoch. Either there is a newer producer with
1. In daily work, a single statistics table often contains several different metrics, e.g. post_count and send_count.
2. These metrics, however, come from different Kafka message bodies.
3. Is there a way, without using UNION ALL, to write each topic's data into the same sink table? With UNION ALL, many 0 values are used as padding.
I see this log as well, but I can't see any messages. I know for a fact
that the topic I am subscribed to has messages, as I checked with a simple
Java consumer in a different group.
org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase -
Consumer subtask 0 will start reading
Hi,
Has anyone run into the same problem? I have updated my producer
transaction timeout to 1.5 hours,
but the problem still happened when I restarted the broker with the active
controller. It might not be due to
the checkpoint duration being too long and causing the transaction to time
out. I have no more clues
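For reference, the timeout is passed to the producer like this (a minimal
sketch against the Flink 1.9-era universal Kafka connector; topic and servers
are illustrative, and the 1.5 h value mirrors the setup above). The broker's
transaction.max.timeout.ms must be at least as large, or the producer fails at
startup.

import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;
import org.apache.flink.streaming.util.serialization.KeyedSerializationSchemaWrapper;

public class ExactlyOnceSink {
    public static FlinkKafkaProducer<String> create() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker:9092");
        // 1.5 hours; must not exceed transaction.max.timeout.ms on the broker.
        props.setProperty("transaction.timeout.ms", String.valueOf(90 * 60 * 1000));

        return new FlinkKafkaProducer<>(
                "my-topic",
                new KeyedSerializationSchemaWrapper<>(new SimpleStringSchema()),
                props,
                FlinkKafkaProducer.Semantic.EXACTLY_ONCE);
    }
}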
Hello everyone, while using Flink's
ProcessFunction to manage state I ran into some questions I'd like to ask about. In my stream I use flatMap to assign multiple endTimes to each record, then key the stream by key+endTime. In the ProcessFunction's
open() I initialize a value
state; in processElement() I register a timer that cleans up the state, and in onTimer() I clear the state. My understanding is that if the state is cleared by the timers on schedule,
the checkpoint size should stay at a roughly stable level, but the stream has been running for 8 days now, and I can see the checkpoint
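A minimal sketch of that pattern, with illustrative types (one thing worth
noting: pending timers are themselves part of the checkpointed state, so a
large number of outstanding endTimes keeps checkpoints big even when the
values are cleared promptly):

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Stream is keyed by "key + endTime"; each key's state dies with its endTime.
public class CleanupFn extends KeyedProcessFunction<String, Long, Long> {
    private transient ValueState<Long> sum;

    @Override
    public void open(Configuration parameters) {
        sum = getRuntimeContext().getState(
                new ValueStateDescriptor<>("sum", Long.class));
    }

    @Override
    public void processElement(Long value, Context ctx, Collector<Long> out) throws Exception {
        Long current = sum.value();
        sum.update(current == null ? value : current + value);
        // One cleanup timer per (key, endTime); here the element timestamp
        // stands in for the endTime carried by the record.
        ctx.timerService().registerEventTimeTimer(ctx.timestamp());
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<Long> out) throws Exception {
        out.collect(sum.value());
        sum.clear(); // frees this key's value state once its endTime has passed
    }
}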
Hi guys,
I am using Kerberos for my Kafka source. I pass the JAAS config and
krb5.conf via env.java.opts: -Dconfig.resource=qa.conf
-Djava.library.path=/usr/mware/SimpleAPI/voltage-simple-api-java-05.12.-Linux-x86_64-64b-r234867/lib/
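The usual way to hand those two files to the JVM looks roughly like this in
flink-conf.yaml (paths are illustrative):

env.java.opts: -Djava.security.auth.login.config=/path/to/jaas.conf -Djava.security.krb5.conf=/path/to/krb5.conf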
Hi Yun Tang,
I can give that approach a try (I am currently using Flink 1.6.2).
PS: with the State Processor API in Flink 1.9, it should be possible to directly modify the data in a savepoint, e.g. the Kafka
offsets.
Best regards,
蒋涛涛
On Thu, Aug 29, 2019 at 12:12 PM, Yun Tang wrote:
> Hi 蒋涛涛
>
>
Thank you for your reply. Then if only one slot is given at startup and an operator's parallelism is set to 2, will it end up executing with parallelism 1 after all?
> On 2019-08-29, at 10:54, pengcheng...@bonc.com.cn wrote:
>
> Hello. As I understand it, the precedence for parallelism is setParallelism > command line > config file.
> If an operator has a parallelism greater than one, each parallel instance occupies one slot.
> Parallelism cannot be set for Flink SQL.
>
>
>
> pengcheng...@bonc.com.cn
>
> From: ddwcg
> Sent: 2019-08-29 10:18
> To: user-zh
> Subject:
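A minimal sketch of the precedence described above (whether the job can then
actually run depends on the number of available slots):

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ParallelismDemo {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1); // job default: overrides flink-conf.yaml and the CLI -p flag

        env.fromElements(1, 2, 3)
           .map(new MapFunction<Integer, Integer>() {
               @Override
               public Integer map(Integer x) {
                   return x * 2;
               }
           })
           .setParallelism(2) // operator-level setting: highest precedence
           .print();

        env.execute("parallelism demo");
    }
}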