The Apache Flink community is very happy to announce the release of Apache
Flink 1.17.2, which is the second bugfix release for the Apache Flink 1.17
series.
Apache Flink® Is a framework and distributed processing engine for stateful
computations over unbounded and bounded data streams. Flink
The Apache Flink community is very happy to announce the release of Apache
Flink 1.17.2, which is the second bugfix release for the Apache Flink 1.17
series.
Apache Flink® Is a framework and distributed processing engine for stateful
computations over unbounded and bounded data streams. Flink
Congratulations!
Unlike other data-lakes, Paimon might be the first one to act as a stream-first
(not batch-first) data-lake.
Best
Yun Tang
From: Xianxun Ye
Sent: Tuesday, March 28, 2023 10:52
To: d...@flink.apache.org
Cc: Yu Li ; user ; user-zh
; d
Congratulations!
Unlike other data-lakes, Paimon might be the first one to act as a stream-first
(not batch-first) data-lake.
Best
Yun Tang
From: Xianxun Ye
Sent: Tuesday, March 28, 2023 10:52
To: d...@flink.apache.org
Cc: Yu Li ; user ; user-zh
; d
Thanks Yuanfei for driving the frocksdb release!
Best
Yun Tang
From: Yuan Mei
Sent: Tuesday, January 31, 2023 15:09
To: Jing Ge
Cc: Yanfei Lei ; d...@flink.apache.org
; user ; user-zh@flink.apache.org
Subject: Re: [ANNOUNCE] FRocksDB 6.20.3-ververica-2.0
Thanks Yuanfei for driving the frocksdb release!
Best
Yun Tang
From: Yuan Mei
Sent: Tuesday, January 31, 2023 15:09
To: Jing Ge
Cc: Yanfei Lei ; d...@flink.apache.org
; user ; user...@flink.apache.org
Subject: Re: [ANNOUNCE] FRocksDB 6.20.3-ververica-2.0
care about the checkpoint size too
much. Instead, we should care more about the output results.
Best
Yun Tang
From: Martijn Visser
Sent: Wednesday, October 19, 2022 22:03
To: Puneet Duggal
Cc: user
Subject: Re: Flink CEP Incremental Checkpoint Issue
Hi,
Given
va/org/apache/flink/streaming/examples/wordcount/WordCount.java#L98
Best
Yun Tang
From: zhanghao.c...@outlook.com
Sent: Thursday, September 15, 2022 0:03
To: Hailu, Andreas ; user@flink.apache.org
Subject: Re: ExecutionMode in ExecutionConfig
It's added in Flin
acks rich information on
>FlinkJobListener just as Feng mentioned, which has been supported well by
>Spark, to send data lineage to external systems.
[1] https://feature-requests.datahubproject.io/p/flink-integration
Best
Yun Tang
From: wangqinghuan <
Hi,育锋
我觉得你的分析应该是没问题的。可以创建一个ticket来修复该问题。另外,关于代码实现的具体讨论,建议在dev邮件列表讨论。
祝好
唐云
From: 朱育锋
Sent: Tuesday, June 14, 2022 19:33
To: user-zh@flink.apache.org
Subject: 怀疑源码中的一个方法是never reached code
Hello Everyone
/iterator.cc#L239-L245
Best
Yun Tang
From: Martijn Visser
Sent: Monday, June 13, 2022 21:47
To: Mike Barborak
Cc: user@flink.apache.org
Subject: Re: NegativeArraySizeException trying to take a savepoint
Hi Mike,
It would be worthwhile to check if this still occurs
Hi
请使用英文在dev社区发送邮件。另外关于使用方面的问题,建议向user-zh 频道发送,已经帮你转发到相关邮件列表了。
祝好
唐云
From: Lose control ./ <286296...@qq.com.INVALID>
Sent: Tuesday, May 24, 2022 9:15
To: dev
Subject: 1.13.5版本sql大小64k限制bug
请问各位大神,1.13.5版本sql大小64k限制如何修改啊?谢谢
/frocksdb/blob/8608d75d85f8e1b3b64b73a4fb6d19baec61ba5c/java/src/main/java/org/rocksdb/DBOptions.java#L520
Best
Yun Tang
From: Alexis Sarda-Espinosa
Sent: Tuesday, May 3, 2022 8:47
To: Peter Brucia
Cc: user@flink.apache.org
Subject: RE: RocksDB's state size discrepancy
to set up our own slack workspace.
Best
Yun Tang
From: Jingsong Li
Sent: Thursday, May 12, 2022 10:49
To: Xintong Song
Cc: dev ; user
Subject: Re: [Discuss] Creating an Apache Flink slack workspace
Hi all,
Regarding using ASF slack. I share the problems I saw
the compatibility problem, we also suggest you to use
customized serializers for your customized class for better performance.
Best
Yun Tang
From: Liting Liu (litiliu)
Sent: Friday, May 6, 2022 10:20
To: user@flink.apache.org
Subject: Failed to restore from ck, because
ttps://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/#state-backend-rocksdb-memory-partitioned-index-filters
Best
Yun Tang
From: Yaroslav Tkachenko
Sent: Thursday, April 21, 2022 0:44
To: Trystan
Cc: user
Subject: Re: RocksDB effici
this is really
useful for Flink as this could push data to the last level, which leads to
increase the read amplification.
[1]
https://javadoc.io/doc/org.rocksdb/rocksdbjni/6.20.3/org/rocksdb/RocksDB.html
Best
Yun Tang
From: Alexis Sarda-Espinosa
Sent: Friday, April 8
whether the sink back-pressured the source to impact
the throughput of source.
Last but not least, did your source already have 100% CPU usage, which means
your source operator has already reached to its highest throughput.
Best
Yun Tang
From: Sigalit Eliazov
Hi,
RocksDB 的CPU栈能卡在100%,很有可能是大量解压缩 index/filter block导致的,可以enable partition
index/filter [1] 看看问题是否解决。
相关内容也可以参考我之前线下做过的分享 [2]
[1]
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#state-backend-rocksdb-memory-partitioned-index-filters
[2]
[1]. Please note that all RocksDB instances within
same slot would share the same block cache, they will report same usage.
[1]
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/#state-backend-rocksdb-metrics-block-cache-usage
Best
Yun Tang
Hi
一般是卡在最后一步从JM写checkpoint meta上面了,建议使用jstack等工具检查一下JM的cpu栈,看问题出在哪里。
祝好
唐云
From: Sun.Zhu <17626017...@163.com>
Sent: Tuesday, March 8, 2022 14:12
To: user-zh@flink.apache.org
Subject: Re:flink s3 checkpoint 一直IN_PROGRESS(100%)直到失败
图挂了
-docs-release-1.13/docs/ops/monitoring/back_pressure/
[2]
https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/dev/datastream/fault-tolerance/custom_serialization/
Best,
Yun Tang
From: Vidya Sagar Mula
Sent: Sunday, March 6, 2022 4:16
To: Yun Tang
Cc
o vs
> Avro)
>From our experience, kryo is not a good choice in most cases.
Best
Yun Tang
From: Vidya Sagar Mula
Sent: Friday, March 4, 2022 17:00
To: user
Subject: Incremental checkpointing & RocksDB Serialization
Hi,
I have a cluster that c
/flink/flink-docs-release-1.14/docs/deployment/config/#state-backend-rocksdb-metrics-block-cache-usage
Best
Yun Tang
From: Alexandre Montecucco
Sent: Friday, February 25, 2022 20:14
To: user
Subject: Pods are OOMKilled with RocksDB backend after a few checkpoints
Hi,
这个需求在社区里面称之为 state bootstrapping, 以前在state processor API没有引入时,还有第三方的工具 bravo
[1]。
我理解你的需求完全可以有state processor API完成,生成一个savepoint,由新作业消费。目前社区也在考虑支持生成native
savepoint,用以加快生成速度 [2]
[1] https://github.com/king/bravo
[2] https://issues.apache.org/jira/browse/FLINK-25528
Best
Yun Tang
Hi Alex,
I think the better solution is to know what the problem you have ever met when
restoring the timers?
Flink does not support to remove state (including timer state) currently.
Best
Yun Tang
From: Alex Drobinsky
Sent: Monday, February 7, 2022 21:09
https://github.com/apache/flink/blob/90e850301e672fc0da293abc55eb446f7ec68ffa/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java#L540
[2] https://issues.apache.org/jira/browse/FLINK-17073
Best
Yun Tang
From: Robert Metzger
Hi Jasmin,
>From my knowledge, it seems no big company would adopt pure file system source
>as the main data source of Flink. We would in general choose a message queue,
>e.g Kafka, as the data source.
Best
Yun Tang
From: Jasmin Redžepović
Sent:
Hi Siddhesh,
The root cause is that the configuration of group.id is missing for the Flink
program. The configuration of restart strategy has no relationship with this.
I think you should pay your attention to kafka related configurations.
Best
Yun Tang
From
- ASF
JIRA<https://issues.apache.org/jira/browse/FLINK-14105>
As the consensus among our community(please link dedicated thread if there is)
we keep in mind that flink-runtime will be eventually scala-free. It is because
of ...
issues.apache.org
Best
Yun Tang
___
-sst-files-size[1] to know the total sst files on disks of each rocksDB.
[1]
https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/deployment/config/#state-backend-rocksdb-metrics-total-sst-files-size
Best
Yun Tang
From: Abdul Rahman
Sent: Saturday,
Hi Singh,
All the output operator transformed by AllWindowedStream would be
SingleOutputStreamOperator, which cannot be parallel.
[1]
https://github.com/apache/flink/blob/master/flink-streaming-scala/src/main/scala/org/apache/flink/streaming/api/scala/AllWindowedStream.scala
Best
Yun Tang
+1 for dropping the MapR Fs.
Best
Yun Tang
From: Till Rohrmann
Sent: Wednesday, January 5, 2022 18:33
To: Martijn Visser
Cc: David Morávek ; dev ; Seth Wiesman
; User
Subject: Re: [DISCUSS] Deprecate MapR FS
+1 for dropping the MapR FS.
Cheers,
Till
On Wed
/#using-operator-state
Best,
Yun Tang
From: Krzysztof Chmielewski
Sent: Thursday, December 23, 2021 6:32
To: user
Subject: Operator state in New Source API
Hi,
Is it possible to use managed operator state like MapState in an implementation
of new unified source
Hi,
如果你们可以自己实现一套SQL语句到jobgraph的预编译转换IDE,然后在IDE中可以手动配置jobgraph每个算子的配置,应该是可以达到你们的目的
(可能还需要结合细粒度调度模式)。
祝好
唐云
From: gygz...@163.com
Sent: Thursday, December 9, 2021 16:14
To: user-zh
Subject: 回复: Re: flink sql支持细粒度的状态配置
Hi Yun Tang
感谢你的回复,我们在调研的过程中也发现,正如你所说的生成的
Hi 你好,
我认为这是一个很好的需求,对于data stream以及python API来说,state
TTL都是通过API逐个配置的,你的需求就可以直接满足。但是对于SQL来说,由于相同的SQL语句,不同优化器其生成的执行plan可能会差异很大,很难对某个operator内的state进行TTL进行配置,可能一种方式是增加一些SQL的优化hint,对于你示例中的join语句和groupBy
的count语句配以不同的TTL,但是目前Flink SQL尚未支持该功能。
祝好
唐云
From:
Great news!
Thanks for all the guys who contributed in this project.
Best
Yun Tang
On 2021/11/30 16:30:52 Till Rohrmann wrote:
> Great news, Yingjie. Thanks a lot for sharing this information with the
> community and kudos to all the contributors of the external shuffle service
> :-)
&g
Hi Yang,
Flink keeps the max key groups the same no matter how parallelism changes, and
use this to avoid state data lost [1]
[1] https://flink.apache.org/features/2017/07/04/flink-rescalable-state.html
Best
Yun Tang
On 2021/11/26 10:07:29 Nicolaus Weidner wrote:
> Hi,
>
> to res
Hi
checkpoint 以及 savepoint是否可以生效取决于相关source的实现,Kafka这种是支持replay非常好的source,至于file
reader,目前 split file reader [1] 相关的实现是支持 容错的
[1]
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/datastream/sources/#the-split-reader-api
祝好
唐云
From: lei-tian
Hi,
具体问题建议直接在相关ticket上进行讨论,邮件列表上可能相关人士没有注意到。
祝好
唐云
From: 不许人间见白头
Sent: Wednesday, November 10, 2021 22:28
To: user-zh
Subject: MongoDB sink
你好,
请问一下,关于New feature: FLINK-24477 预计什么时候创建PR呢?
each time.
[1]
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/#state-backend-rocksdb-log-dir
[2]
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/#state-backend-rocksdb-log-level
Best
Yun Tang
Hi
可以使用jstack,async profiler [1]
等工具勘察一下checkpoint期间的CPU栈。oss需要先写本地再上传,确实可能CPU消耗多一些,但是明显高很多有一些超出预期。
[1] https://github.com/jvm-profiling-tools/async-profiler
祝好
唐云
From: Lei Wang
Sent: Tuesday, October 19, 2021 14:01
To: user-zh@flink.apache.org
Subject: Re:
Hi,
先问个版本问题,你的Flink版本是1.3 而不是1.13?
Checkpoint里面会存储timer,所以重启之后会触发窗口的计算,但确实这种一天的窗口累计有点太多了,除非你的作业存在比较严重的反压,导致checkpoint内积攒了大量没有触发的timer。
祝好
唐云
From: claylin <1012539...@qq.com.INVALID>
Sent: Friday, October 29, 2021 11:33
To: user-zh
Subject:
Thanks for Chesnay & Martijn and everyone who made this release happen.
Best
Yun Tang
From: JING ZHANG
Sent: Friday, October 22, 2021 10:17
To: dev
Cc: Martijn Visser ; Jingsong Li
; Chesnay Schepler ; user
Subject: Re: [ANNOUNCE] Apache Flink 1.13.3 rele
, it will pick the registered kryo
serializer only for checkpoint/restore. Since java-based state backend would
not deep copy key-values for performance reasons, it might be changed
unexpectedly if user misused, which might make the field reset to default value.
Best,
Yun Tang
could read the transient
field back if using FileSystemStateBackend.
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/serialization/types_serialization/#flinks-typeinformation-class
Best
Yun Tang
From: Alex Drobinsky
Sent: Monday, October
mechanism to remove
useless files, which is triggered by internal compaction. You should not care
too much on the checkpointed data size as your job consuming more and more
records, moreover the increasing size is actually quite small (from 1.32GB to
1.34GB).
Best
Yun Tang
t; 写道:
>我司基于最新提供流批一体的接口,实现了mongodb的连接器,支持source和sink,实现了控制批量插入频率、控制缓存批的数据量和mongo文档对象和java对象转换,同时还可选择批量更新,并且使用的mongo最新异步驱动,后期我还会不断优化性能,看大佬能否推动一下,把这个连接器贡献给社区
>
>
>-- 原始邮件 ------
>发件人: "Yun Tang";
>发件时间: 2021-09-22 10:55
>收件人: "user-z
Hi,
其实目前Flink社区并不是那么欢迎新增官方支持的connector,主要原因就是社区的开发人员有限,没有精力维护太多的connector,尤其是一些connector的实现需要一定的相关背景,但很难保证review代码的开发人员具有相关背景,毕竟大家都需要为自己approve的代码负责。
你可以在 flink-packages [1] 里面找一下,或者考虑自己实现并维护(基础实现应该是复杂度不高的)。
[1] https://flink-packages.org/
祝好
唐云
From: 黑色
/src/main/java/org/apache/flink/runtime/checkpoint/Checkpoints.java#L99
[2] https://issues.apache.org/jira/browse/FLINK-24149
Best
Yun Tang
From: Robin Cassan
Sent: Tuesday, September 7, 2021 20:17
To: Yun Tang
Cc: Robert Metzger ; user
Subject: Re: Cleaning old
read option which disable fillCache [2] to speedup bulk scan in the future to
help improve the performance.
Best
Yun Tang
[1] https://github.com/jvm-profiling-tools/async-profiler
[2]
https://javadoc.io/doc/org.rocksdb/rocksdbjni/6.20.3/org/rocksdb/ReadOptions.html#setFillCache(boolean
solution to provide the ability to re-upload
all files under some specific configured option so that we could let new job
get decoupled with older checkpoints. Do you think that could resolve your case?
Best
Yun Tang
From: Robin Cassan
Sent: Wednesday, September
ir=oss://bucket-logcenter/flink-state/flink-session-recovery
\
-Dcontainerized.master.env.ENABLE_BUILT_IN_PLUGINS=flink-oss-fs-hadoop-1.13.2.jar
\
-Dcontainerized.taskmanager.env.ENABLE_BUILT_IN_PLUGINS=flink-oss-fs-hadoop-1.13.2.jar
-----邮件原件-
发件人: Yun Tang
发送时间: 2021年8月30日 11:36
收件人:
Hi,
你好,图片无法加载,可以直接粘贴文字出来
祝好
唐云
From: dker eandei
Sent: Friday, August 27, 2021 14:58
To: user-zh@flink.apache.org
Subject: flink oss ha
您好:
看文档OSS可以用作 FsStatebackend,那么Flink on k8s
做高可用时,high-availability.storageDir可以配置成oss吗,我试了下,报以下错误:
Hi 航飞
可以参照[1] 看是不是类似的问题
[1] https://issues.apache.org/jira/browse/FLINK-23721
祝好
唐云
From: 李航飞
Sent: Thursday, August 26, 2021 19:02
To: user-zh
Subject: table.exec.state.ttl
Configuration conf = new Configuration();
.
[1] https://shipilev.net/jvm/anatomy-quarks/12-native-memory-tracking/
[2] https://gist.github.com/thomasdarimont/79b3cef01e5786210309
Best
Yun Tang
From: Pranjul Ahuja
Sent: Monday, August 16, 2021 13:10
To: user@flink.apache.org
Subject: JobManager Resident
names.
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/libs/state_processor_api/#state-processor-api
[2]
https://github.com/apache/flink/blob/0d2b945729df8f0a0149d02ca24633ae52a1ef21/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/Checkpoints.java#L99
Best
Yun
are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12350218==12315522
We would like to thank all contributors of the Apache Flink community who made
this release possible!
Regards,
Yun Tang
are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12350218==12315522
We would like to thank all contributors of the Apache Flink community who made
this release possible!
Regards,
Yun Tang
.
Best
Yun Tang
From: Jerome Li
Sent: Friday, August 6, 2021 7:57
To: user@flink.apache.org
Subject: Best Practice of Using HashSet State
Hi,
I am new to Flink and state backend. I find Flink does provide ValueState,
ListState, and MapState. But it does
Hi Sandeep,
If you set the flink-statebackend-rocksdb as provided scope, it should not
include the org.rocksdb classes, have you ever checked your application jar
package directly just as what I described?
Best
Yun Tang
From: Sandeep khanzode
Sent: Friday
Hi
Flink-1.13.2 的jar包正在同步到给个maven仓库,顺利的话,明天就可以正式announce了。
祝好
唐云
From: Jingsong Li
Sent: Wednesday, August 4, 2021 16:56
To: user-zh
Subject: Re: 1.14啥时候出呀
1.14还有1-2个月
1.13.2马上就出了,估计明天或后天或周一
On Wed, Aug 4, 2021 at 4:48 PM yidan zhao wrote:
>
has different flink version with
servers.
Best,
Yun Tang
From: Stephan Ewen
Sent: Wednesday, August 4, 2021 19:10
To: Yun Tang
Cc: Sandeep khanzode ; user
Subject: Re: Bloom Filter - RocksDB - LinkageError Classloading
@Yun Tang Does it make sense to add RocksDB
] https://github.com/facebook/rocksdb/tree/master/java/jmh
Best,
Yun Tang
From: Piotr Nowojski
Sent: Thursday, August 5, 2021 2:01
To: Yuval Itzchakov
Cc: Yun Tang ; Nico Kruber ;
user@flink.apache.org ; dev
Subject: Re: [ANNOUNCE] RocksDB Version Upgrade
/frocksdb/pull/19
[13] https://github.com/facebook/rocksdb/pull/5441/
[14] https://github.com/facebook/rocksdb/pull/2283
Best,
Yun Tang
On Wed, Aug 4, 2021 at 2:36 PM Yuval Itzchakov
mailto:yuva...@gmail.com>> wrote:
We are heavy users of RocksDB and have had several issues with memory m
added
the dependency of org.rocksdb:rocksdbjni in your pom).
Best
Yun Tang
From: Sandeep khanzode
Sent: Wednesday, August 4, 2021 11:54
To: user
Subject: Bloom Filter - RocksDB - LinkageError Classloading
Hello,
I tried to add the bloom filter functionality
/CheckpointCoordinator.java#L1226
Best
Yun Tang
From: Manong Karl
Sent: Wednesday, August 4, 2021 9:17
To: Harsh Shah
Cc: user@flink.apache.org
Subject: Re: Flink k8 HA mode + checkpoint management
Can You please share your configs? I'm using native kubernetes without HA
Hi Mason,
I think this request is reasonable and you could create a JIRA ticket so that
we could resolve it later.
Best,
Yun Tang
From: Mason Chen
Sent: Tuesday, July 27, 2021 15:15
To: Yun Tang
Cc: Mason Chen ; user@flink.apache.org
Subject: Re
Hi Mason,
In rocksDB, one state is corresponding to a column family and we could
aggregate all RocksDB native metrics per column family. If my understanding is
right, are you hoping that all state latency metrics for a particular state
could be aggregated per state level?
Best
Yun Tang
目前Flink社区版RocksDB尚不支持ARM架构机器。使用RocksDB的话,内存均是堆外管理,与JVM的堆上内存无关。
另外,有个题外话,你们是云上产品还是自建了ARM集群?有点好奇目前国内的ARM集群使用率情况。
祝好
唐云
From: Wanghui (HiCampus)
Sent: Thursday, July 15, 2021 11:33
To: user-zh@flink.apache.org
Subject: Re: flink大窗口性能问题
我在aarch64 + jre
Hi,
这个看上去是client触发savepoint失败,而不是savepoint本身end-to-end执行超时。建议对照一下JobManager的日志,观察在触发的时刻,JM日志里是否有触发savepoint的相关日志,也可以在flink
web UI上观察相应的savepoint是否出现在checkpoint tab的历史里面。
祝好
唐云
From: 仙剑……情动人间 <1510603...@qq.com.INVALID>
Sent: Tuesday, July 13, 2021 17:31
To:
Hi
只要enable了checkpoint,一定会生成checkpoint的,这与你的运行模式无关。可以检查一下日志,看看JM端是否正常触发了checkpoint
祝好
唐云
From: casel.chen
Sent: Tuesday, June 29, 2021 9:55
To: user-zh@flink.apache.org
Subject: local运行模式下不会生成checkpoint吗?
我在本地使用local运行模式运行flink sql,将数据从kafka写到mongodb,mongodb
Hi,
有可能的,如果下游发生了死锁,无法消费任何数据的话,整个作业就假死了。要定位root cause还是需要查一下下游的task究竟怎么了
祝好
唐云
From: Ada Luna
Sent: Tuesday, July 6, 2021 12:04
To: user-zh@flink.apache.org
Subject: Re: Flink 1.10 内存问题
反压会导致整个Flink任务假死吗?一条Kafka数据都不消费了。持续几天,不重启不恢复的
Yun Tang 于2021年7月6日周二 上午11
Hi,
LocalBufferPool.requestMemorySegment
这个方法并不是在申请内存,而是因为作业存在反压,因为下游没有及时消费,相关buffer被占用,所以上游会卡在requestMemorySegment上面。
想要解决还是查一下为什么下游会反压。
祝好
唐云
From: Ada Luna
Sent: Tuesday, July 6, 2021 10:43
To: user-zh@flink.apache.org
Subject: Re: Flink 1.10 内存问题
://github.com/apache/flink/commit/b332ce40d88be84d9cf896f446c7c6e26dbc8b6a#diff-54e9ce3b15d6badcc9376ab144df066eb46c4e516d6ee31ef8eb38e2d4359042
Best
Yun Tang
From: Matthias Pohl
Sent: Thursday, July 1, 2021 16:41
To: tao xiao
Cc: Yun Tang ; user ; Roman
Khachatryan
Subject
containing
'FromElementsFunctionT' has ever been completed.
Best
Yun Tang
From: tao xiao
Sent: Saturday, June 26, 2021 16:40
To: user
Subject: Re: Exception in snapshotState suppresses subsequent checkpoints
Btw here is the checkpoint related log
[2021-06-26 16:08
JVM.
Some tools provided by memory allocator such jemalloc or tcmalloc, could help
find how much the memory usage via OS malloc. Even though, there still exists
some memory used by mmap or on local stack, which is not so easy to detect.
Best
Yun Tang
From: Guowei
-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L158-L159
Best,
Yun Tang
From: Vijayendra Yadav
Sent: Wednesday, June 23, 2021 11:02
To: user
Subject: High Flink checkpoint Size
Hi Team,
I
/flink/api/common/typeinfo/TypeInformation.java#L208
Best
Yun Tang
From: Rommel Holmes
Sent: Wednesday, June 23, 2021 13:43
To: user
Subject: PoJo to Avro Serialization throw KryoException:
java.lang.UnsupportedOperationException
My Unit test was running OK under
Hi Kevin,
欢迎来到Apache Flink开源社区!
因为开源社区的工作,一些参与者很多时候都是工作时间之外参与的,可能难免遇到进度更新不及时,或者长时间不再活跃的问题。
非常欢迎您在相关JIRA
ticket下面评论和申请权限创建PR,社区一直都欢迎每一位贡献者,对于文档的维护尤其是中文文档的翻译也是非常需要的,如果有任何想要贡献的部分,欢迎直接去JIRA
ticket下面、github PR下面评论,或者直接创建相关ticket。
祝好
唐云
From: pang fan
Sent:
Hi Yidan,
1. 是否我从FileStateBackend切换到RocksDB其实性能也不会降低很多呢?
2. 这个涉及到RocksDB这种LSM架构的DB读写路径了,即使逻辑数据量可以全部存储在内存中,由于RocksDB的write
buffer默认会存储相同key的不同value,而且checkpoint时候仍然会触发flush,很难避免数据落盘,数据落盘之后的读路径肯定没有Flink的内存state
backend性能好,二者性能还是有些差异的,不过实际生产中可能不需要 FsStateBackend 那么高的性能。
1.
Hi Chen-Che,
The PR-16177 [1] is the documentation for state access latency tracking,
thought it has not been merged, you could still refer it for more details.
[1] https://github.com/apache/flink/pull/16177
Best
Yun Tang
From: Chen-Che Huang
Sent: Friday
Hi,
你可以参照社区的 state-evolution的 E2E 测试代码 [1], 整个程序就是使用的avro作为相关类的声明工具。
[1]
https://github.com/apache/flink/tree/master/flink-end-to-end-tests/flink-state-evolution-test/src/main
祝好
唐云
From: casel.chen
Sent: Friday, June 11, 2021 8:13
To:
thread stack.
[1] https://github.com/jvm-profiling-tools/async-profiler
Best
Yun Tang
From: Robert Metzger
Sent: Thursday, June 17, 2021 14:11
To: Padarn Wilson
Cc: JING ZHANG ; user
Subject: Re: RocksDB CPU resource usage
If you are able to execute your job loc
Hi Jiang,
Please take a look at FLINK-17860 and FLINK-13856 for previous discussion of
this problem.
[1] https://issues.apache.org/jira/browse/FLINK-17860
[2] https://issues.apache.org/jira/browse/FLINK-13856
Best
Yun Tang
From: Guowei Ma
Sent: Wednesday
state size will be.
Moreover, the OOM should not be related to RocksDB as it used off-heap native
memory, and you might need some work to dig what occupied the JVM memory during
checkpoints.
Best
Yun Tang
From: McBride, Chris
Sent: Saturday, June 5, 2021 3:17
hi,
本质上来说,你的做法有点hack其实不推荐,如果非要这么做的话,你还可以通过 numRestarts [1] 的指标来看重启了多少次。
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/metrics/#availability
祝好
唐云
From: yidan zhao
Sent: Friday, June 4, 2021 11:52
To: user-zh
Subject: Re: 关于flink
Hi,
没有被引用的文件可能也不是完全无用的,可能是当前pending checkpoint正在上传的,所以还需要比较一下那些不在checkpoint
meta内的文件的修改时间戳,可能比你分析的complete checkpoint的时间戳要大。
总体上来说,我不认为这个问题是一个bug,这个是LSM架构的DB的空间放大问题。如果你对空间放大非常担心的话,可以启用 dynamic level [1]
来严格控制空间放大,不过这个可能会影响写放大和读放大,导致性能受到一定影响。
[1]
Hi,
增量checkpoint上传的是sst文件本身,里面可能有一部分空间是被无用数据占据的,你可以理解成增量checkpoint上传的是受到空间放大影响的RocksDB的数据,如果因为单机的数据量较小,没有及时触发compaction的话,确实存在整个远程checkpoint目录数据大于当前实际空间的情况。而关闭增量checkpoint,上传的其实是与savepoint格式一样的kv数据对,Flink会遍历整个DB,将目前有效的数据写出到远程。所以你关闭增量checkpoint,而发现checkpoint目录保持恒定大小的话,说明真实有效数据的空间是稳定的。
Hi,
先确定一下,你的 idleStateRetention 是 3600秒?其次,要想看是否所有数据均有用,可以利用
Checkpoints.loadCheckpointMeta [1] 去加载你所保留的checkpoint目录下的 _metadata
文件,然后与当前checkpoint目录下的文件作对比,看是否存在大量的未删除旧文件。
目前仅凭你的描述和一段SQL代码其实很难判断。
可能存在的原因有:
1. 单次checkpoint文件数目过多,JM单点删除跟不上相关速度
2. 整体checkpoint
-alignment-timeout
Best
Yun Tang
From: Senhong Liu
Sent: Monday, May 31, 2021 10:33
To: JING ZHANG
Cc: Kai Fu ; user
Subject: Re: Dynamic configuration of Flink checkpoint interval
Hi all,
In fact, a pretty similar JIRA has been created, which is
https
customized serializer for those classes. Another
better solution is to ensure the class backwards compatibility with customized
serializer or leverage apache avro.
You could refer to [1] for more details.
[1]
https://flink.apache.org/news/2020/04/15/flink-serialization-tuning-vol-1.html
Best
Yun
Hi
BaseRowSerializer 已经在Flink-1.11 时候改名成 RowDataSerializer了,即使用
state-processor-API 也没办法处理当前不存在的类,可能有种比较复杂的办法是自己把 BaseRowSerializer
的类不改变package的情况下拷贝出来,然后用 state-processor-API 将相关类强制转换成
RowDataSerializer,不过考虑到你的job graph都是SQL生成的,state-processor-API面向地更多的是data
stream
Thanks for Dawid and Guowei's great work, and thanks for everyone involved for
this release.
Best
Yun Tang
From: Xintong Song
Sent: Thursday, May 6, 2021 12:08
To: user ; dev
Subject: Re: [ANNOUNCE] Apache Flink 1.13.0 released
Thanks Dawid & Gu
Hi,
你可以参阅文档 [1] :
由于 RocksDB 的 JNI API 构建在 byte[] 数据结构之上, 所以每个 key 和 value 最大支持 2^31 字节。 重要信息:
RocksDB 合并操作的状态(例如:ListState)累积数据量大小可以超过 2^31 字节,但是会在下一次获取数据时失败。这是当前 RocksDB
JNI 的限制。
[1]
/c688bf3c83e72155ccf5d04fe397b7c0a1274fd1/flink-tests/src/test/java/org/apache/flink/test/checkpointing/SavepointITCase.java#L438
Best
Yun Tang
From: Abdullah bin Omar
Sent: Tuesday, May 4, 2021 11:50
To: user@flink.apache.org
Subject: savepoint command in code
Hello,
I am trying
Hi Dan,
You could refer to the "Fix Versions" in FLINK-16753 [1] and know that this bug
is resolved after 1.11.3 not 1.11.1.
[1] https://issues.apache.org/jira/browse/FLINK-16753
Best
Yun Tang
From: Dan Hill
Sent: Tuesday, April 27, 2021 7:50
To: Yu
Hi Dan,
I think you might use older version of Flink and this problem has been resolved
by FLINK-16753 [1] after Flink-1.10.3.
[1] https://issues.apache.org/jira/browse/FLINK-16753
Best
Yun Tang
From: Robert Metzger
Sent: Monday, April 26, 2021 14:46
To: Dan
able里面实际kv状态
---原始邮件---
发件人:Yun Tang
Hi
snapshotState主要是给operator state用的,异常原因是keyed state
访问时需要设置currentKey的,但是currentKey是当前正在处理的record的key,与snapshotState的执行时候的语义不一样,执行snapshotState方法的时候,是可以没有当前record的。
如果想要访问整个keyed state,可以通过 KeyedStateBackend#getKeys(String state, N namespace)
来访问,但还是不建议将keyed
1 - 100 of 530 matches
Mail list logo