Re: flink1.10.1/1.11.1 使用sql 进行group 和 时间窗口 操作后 状态越来越大

2020-12-25 文章 Storm☀️
在测试环境:
关闭增量chk,全量的state大小大约在:100M左右;
之前开启:我观察了一段时间,膨胀到5G,而且还一直在增长;
sql:
select sum(xx) group by 1 分钟窗口

过期时间设置的为:5-30min



--
Sent from: http://apache-flink.147419.n8.nabble.com/

Re: flink1.10.1/1.11.1 使用sql 进行group 和 时间窗口 操作后 状态越来越大

2020-12-22 文章 Storm☀️
唐云大佬好,
我关闭了chk的增量模式之后,chkstate确实不会再无线膨胀了。这个是我配置的不准确,还是一个已知问题呢



--
Sent from: http://apache-flink.147419.n8.nabble.com/

Re: flink1.10.1/1.11.1 使用sql 进行group 和 时间窗口 操作后 状态越来越大

2020-12-22 文章 Storm☀️
"计算机的解决方案是 出现问题,大都是保护现场,等问题解决后,释放现场 "
 那么解决问题的方法是?生产上state还在不断膨胀。
简单一个问题,生产上发生OOM了,短时间内无法排查出原因,请问如何处理?



--
Sent from: http://apache-flink.147419.n8.nabble.com/

Re: 执行mvn构建错误

2020-12-22 文章 Storm☀️
看了下找不到的包是example相关的包(不影响core相关代码),可以从别处下载下来,然后添加到本地maven库内。



--
Sent from: http://apache-flink.147419.n8.nabble.com/

Re: 请教一下flink1.12可以指定时间清除state吗?

2020-12-22 文章 Storm☀️
增加一个定时器,可以指定时间做清理动作。可以参考flink中window trigger的相关代码和实现。



--
Sent from: http://apache-flink.147419.n8.nabble.com/

Flink1.10.1代码自定义rocksdb backend webui中还是yaml显示默认的配置

2020-12-22 文章 Storm☀️
flink1.10.1
在代码中通过如下方式指定backend
streamEnv.setStateBackend(new
RocksDBStateBackend("hdfs:///user/flink/flink-checkpoints", false));

问题:flink web ui jm中显示的依然为yaml中默认配置
实际上面代码中的配置已经生效



--
Sent from: http://apache-flink.147419.n8.nabble.com/


Re: flink1.10.1/1.11.1 使用sql 进行group 和 时间窗口 操作后 状态越来越大

2020-12-17 文章 Storm☀️
state.backend.incremental 出现问题的时候增量模式是开启的吗?



--
Sent from: http://apache-flink.147419.n8.nabble.com/


Re: Re:Re: flink sql作业state size一直增加

2020-12-17 文章 Storm☀️
mini batch默认为false 。题主问题找到了吗



--
Sent from: http://apache-flink.147419.n8.nabble.com/


Re: 使用sql时候,设置了idle过期时间,但是状态还是一直变大

2020-12-17 文章 Storm☀️
flink 1.10.1 同样遇到这个问题 设置了ttl但是没有生效,请问题主解决该问题了吗?
*sql*:
select
* 
from 
xx
group by 
TUMBLE(monitor_processtime, INTERVAL '60' SECOND),topic_identity

*60s的窗口,设置的过期时间是2分钟,但是checkpoint中状态还是在变大*

*tEnv.getConfig().setIdleStateRetentionTime(Time.minutes(2),
Time.minutes(5)); *
   



--
Sent from: http://apache-flink.147419.n8.nabble.com/


Re: flink1.11日志上报

2020-10-27 文章 Storm☀️
我们也是用的kafkaappender进行日志上报,然后在ES中提供日志检索



--
Sent from: http://apache-flink.147419.n8.nabble.com/


Re: Re:Re: Flink 1.10.1 checkpoint失败问题

2020-10-14 文章 Storm☀️
非常感谢。
后续我关注下这个问题,有结论反馈给大家,供参考。



--
Sent from: http://apache-flink.147419.n8.nabble.com/

Re: Flink 1.10.1 checkpoint失败问题

2020-10-13 文章 Storm☀️
flink版本:Flink1.10.1 
部署方式:flink on yarn
hadoop版本:cdh5.15.2-2.6.0
现状:Checkpoint CountsTriggered: 9339In Progress: 0Completed: 8439Failed:
900Restored: 7
错误信息:
ava.lang.Exception: Could not perform checkpoint 1194 for operator Map
(3/3).
at
org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:816)
at
org.apache.flink.streaming.runtime.io.CheckpointBarrierHandler.notifyCheckpoint(CheckpointBarrierHandler.java:86)
at
org.apache.flink.streaming.runtime.io.CheckpointBarrierTracker.processBarrier(CheckpointBarrierTracker.java:99)
at
org.apache.flink.streaming.runtime.io.CheckpointedInputGate.pollNext(CheckpointedInputGate.java:155)
at
org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:133)
at
org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:69)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:310)
at
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:187)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:485)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:469)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:708)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:533)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
at
org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.executeCheckpointing(StreamTask.java:1382)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.checkpointState(StreamTask.java:974)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$performCheckpoint$5(StreamTask.java:870)
at
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:94)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:843)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:803)
... 12 more


同样的程序在11.2的版本上,chk是完全正常的。





--
Sent from: http://apache-flink.147419.n8.nabble.com/


Re: Flink 1.10.1 checkpoint失败问题

2020-10-09 文章 Storm☀️
尝试了将jdk升级到了261,报错依然还有。



--
Sent from: http://apache-flink.147419.n8.nabble.com/


Re: Flink 1.10.1 checkpoint失败问题

2020-10-08 文章 Storm☀️
谢谢
我看了那个issue,有问题的是jdk 1.8_060版本的,我们用的是074版本的。
我测试环境尝试升级一下jdk到251版本。



--
Sent from: http://apache-flink.147419.n8.nabble.com/

Flink 1.10.1 checkpoint失败问题

2020-09-26 文章 Storm☀️
各位好,checkpoint相关问题L

flink版本1.10.1:,个别的checkpoint过程发生问题:
java.lang.Exception: Could not perform checkpoint 1194 for operator Map
(3/3).
at
org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:816)
at
org.apache.flink.streaming.runtime.io.CheckpointBarrierHandler.notifyCheckpoint(CheckpointBarrierHandler.java:86)
at
org.apache.flink.streaming.runtime.io.CheckpointBarrierTracker.processBarrier(CheckpointBarrierTracker.java:99)
at
org.apache.flink.streaming.runtime.io.CheckpointedInputGate.pollNext(CheckpointedInputGate.java:155)
at
org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:133)
at
org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:69)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:310)
at
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:187)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:485)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:469)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:708)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:533)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
at
org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.executeCheckpointing(StreamTask.java:1382)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.checkpointState(StreamTask.java:974)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$performCheckpoint$5(StreamTask.java:870)
at
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:94)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:843)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:803)
... 12 mor

绝大部分是正常完成的,但是小部分比如上面的情况,就会失败,还会导致suspending-->restart.



--
Sent from: http://apache-flink.147419.n8.nabble.com/


Re: Flink 1.10.1 cdh-hdfs启用HA时,flink job提交报错

2020-09-03 文章 Storm☀️
问题找到了;
hdfs-site.xml配置文件冲突导致的。

原因:通过-yt上传了 外部集群的hdfs-site.xml文件。

flink10初始化taskmanager读取 hdfs-site.xml配置的时候被外部的hdfs-site.xml文件干扰。

此问题终结。



--
Sent from: http://apache-flink.147419.n8.nabble.com/

Re: flink-1.11 sql写ES6问题

2020-09-02 文章 Storm☀️
可能是jar包冲突。



--
Sent from: http://apache-flink.147419.n8.nabble.com/


Re: Re:Flink 1.10.1 cdh-hdfs启用HA时,flink job提交报错

2020-09-02 文章 Storm☀️
现在的配置是这样的,没有添加namenode+ip;
jobmanager.archive.fs.dir: hdfs:///completed-jobs/

需要改成:
hdfs://nameservice2/completed-jobs/  这样的吗?


感觉是创建fs的时候错了。看到这部分异常:
createNonHAProxy




--
Sent from: http://apache-flink.147419.n8.nabble.com/


Flink 1.10.1 cdh-hdfs启用HA时,flink job提交报错

2020-09-02 文章 storm


各位老师好,在HDFS上开启HA的时候,向yarn提交任务的时候,遇到点问题。
cdh版本:5.15.2
hdfs版本:2.6.0
启动模式:flink-on-yarn
配置了HADOOP_CONF_DIR=/etc/hadoop/conf
命令:
./bin/flink run -m yarn-cluster -yt /yarn-conf -p 3 -ytm 2048 -ys 1 -ynm xxx
/jars/flink10.jar xxx

HDFS不启用HA的时候,能正常提交。

提交任务到yarn的时候,出现如下异常:nameservice2 是HA配置时候自定义的nameservice

2020-09-02 14:53:08,118 DEBUG org.apache.flink.yarn.YarnResourceManager  -
TM:remote keytab path obtained null
2020-09-02 14:53:08,119 DEBUG org.apache.flink.yarn.YarnResourceManager  -
TM:remote keytab principal obtained null
2020-09-02 14:53:08,119 DEBUG org.apache.flink.yarn.YarnResourceManager  -
TM:remote yarn conf path obtained null
2020-09-02 14:53:08,119 DEBUG org.apache.flink.yarn.YarnResourceManager  -
TM:remote krb5 path obtained null
2020-09-02 14:53:08,120 ERROR org.apache.flink.yarn.YarnResourceManager  -
Could not start TaskManager in container
container_1598944802155_0042_01_06.
java.lang.IllegalArgumentException: java.net.UnknownHostException:
nameservice2
at
org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:374)
at
org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:312)
at
org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:178)
at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:665)
at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:601)
at
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:148)
at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2596)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at 
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.flink.yarn.Utils.createTaskExecutorContext(Utils.java:469)
at
org.apache.flink.yarn.YarnResourceManager.createTaskExecutorLaunchContext(YarnResourceManager.java:582)
at
org.apache.flink.yarn.YarnResourceManager.startTaskExecutorInContainer(YarnResourceManager.java:384)
at java.lang.Iterable.forEach(Iterable.java:75)
at
org.apache.flink.yarn.YarnResourceManager.lambda$onContainersAllocated$1(YarnResourceManager.java:366)
at
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:402)
at
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:195)
at
org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
at
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
at scala.PartialFunction.applyOrElse(PartialFunction.scala:123)
at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122)
at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
at akka.actor.Actor.aroundReceive(Actor.scala:517)
at akka.actor.Actor.aroundReceive$(Actor.scala:515)
at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
at akka.actor.ActorCell.invoke(ActorCell.scala:561)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
at akka.dispatch.Mailbox.run(Mailbox.scala:225)
at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.net.UnknownHostException: nameservice2
... 41 more



--
Sent from: http://apache-flink.147419.n8.nabble.com/


Flink 1.10.1 cdh-hdfs启用HA时,flink job提交报错

2020-09-02 文章 Storm☀️
各位老师,好。
Flink on  yarn 模式提交
hadoop:2.6.0
cdh 5.15.2
HADOOP_CONF_DIR=/etc/hadoop/conf

在cdh上开启hdfs的HA之后提交任务报错,不开启HA能正常提交任务。
启动方式:
/bin/flink run -m yarn-cluster -yt /yarn-conf -p 3 -ytm 2048 -ys 1 -ynm xxx
/jars/flink10.jar  xx
报错信息r如下:


2020-09-02 14:53:08,118 DEBUG org.apache.flink.yarn.YarnResourceManager  -
TM:remote keytab path obtained null
2020-09-02 14:53:08,119 DEBUG org.apache.flink.yarn.YarnResourceManager  -
TM:remote keytab principal obtained null
2020-09-02 14:53:08,119 DEBUG org.apache.flink.yarn.YarnResourceManager  -
TM:remote yarn conf path obtained null
2020-09-02 14:53:08,119 DEBUG org.apache.flink.yarn.YarnResourceManager  -
TM:remote krb5 path obtained null
2020-09-02 14:53:08,120 ERROR org.apache.flink.yarn.YarnResourceManager  -
Could not start TaskManager in container
container_1598944802155_0042_01_06.
java.lang.IllegalArgumentException: java.net.UnknownHostException:
nameservice2
at
org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:374)
at
org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:312)
at
org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:178)
at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:665)
at org.apache.hadoop.hdfs.DFSClient.(DFSClient.java:601)
at
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:148)
at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2596)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at 
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.flink.yarn.Utils.createTaskExecutorContext(Utils.java:469)
at
org.apache.flink.yarn.YarnResourceManager.createTaskExecutorLaunchContext(YarnResourceManager.java:582)
at
org.apache.flink.yarn.YarnResourceManager.startTaskExecutorInContainer(YarnResourceManager.java:384)
at java.lang.Iterable.forEach(Iterable.java:75)
at
org.apache.flink.yarn.YarnResourceManager.lambda$onContainersAllocated$1(YarnResourceManager.java:366)
at
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:402)
at
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:195)
at
org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
at
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
at scala.PartialFunction.applyOrElse(PartialFunction.scala:123)
at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122)
at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
at akka.actor.Actor.aroundReceive(Actor.scala:517)
at akka.actor.Actor.aroundReceive$(Actor.scala:515)
at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
at akka.actor.ActorCell.invoke(ActorCell.scala:561)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
at akka.dispatch.Mailbox.run(Mailbox.scala:225)
at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.net.UnknownHostException: nameservice2
... 41 more



--
Sent from: http://apache-flink.147419.n8.nabble.com/


ceshi

2020-09-02 文章 Storm☀️
测试内容



--
Sent from: http://apache-flink.147419.n8.nabble.com/