[ANNOUNCE] Apache Flink 1.17.2 released

2023-11-28 Thread Yun Tang
The Apache Flink community is very happy to announce the release of Apache Flink 1.17.2, which is the second bugfix release for the Apache Flink 1.17 series. Apache Flink® Is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. Flink

[ANNOUNCE] Apache Flink 1.17.2 released

2023-11-28 Thread Yun Tang
The Apache Flink community is very happy to announce the release of Apache Flink 1.17.2, which is the second bugfix release for the Apache Flink 1.17 series. Apache Flink® Is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. Flink

Re: [ANNOUNCE] Flink Table Store Joins Apache Incubator as Apache Paimon(incubating)

2023-03-27 Thread Yun Tang
Congratulations! Unlike other data-lakes, Paimon might be the first one to act as a stream-first (not batch-first) data-lake. Best Yun Tang From: Xianxun Ye Sent: Tuesday, March 28, 2023 10:52 To: d...@flink.apache.org Cc: Yu Li ; user ; user-zh ; d

Re: [ANNOUNCE] Flink Table Store Joins Apache Incubator as Apache Paimon(incubating)

2023-03-27 Thread Yun Tang
Congratulations! Unlike other data-lakes, Paimon might be the first one to act as a stream-first (not batch-first) data-lake. Best Yun Tang From: Xianxun Ye Sent: Tuesday, March 28, 2023 10:52 To: d...@flink.apache.org Cc: Yu Li ; user ; user-zh ; d

Re: [ANNOUNCE] FRocksDB 6.20.3-ververica-2.0 released

2023-01-30 Thread Yun Tang
Thanks Yuanfei for driving the frocksdb release! Best Yun Tang From: Yuan Mei Sent: Tuesday, January 31, 2023 15:09 To: Jing Ge Cc: Yanfei Lei ; d...@flink.apache.org ; user ; user-zh@flink.apache.org Subject: Re: [ANNOUNCE] FRocksDB 6.20.3-ververica-2.0

Re: [ANNOUNCE] FRocksDB 6.20.3-ververica-2.0 released

2023-01-30 Thread Yun Tang
Thanks Yuanfei for driving the frocksdb release! Best Yun Tang From: Yuan Mei Sent: Tuesday, January 31, 2023 15:09 To: Jing Ge Cc: Yanfei Lei ; d...@flink.apache.org ; user ; user...@flink.apache.org Subject: Re: [ANNOUNCE] FRocksDB 6.20.3-ververica-2.0

Re: Flink CEP Incremental Checkpoint Issue

2022-10-22 Thread Yun Tang
care about the checkpoint size too much. Instead, we should care more about the output results. Best Yun Tang From: Martijn Visser Sent: Wednesday, October 19, 2022 22:03 To: Puneet Duggal Cc: user Subject: Re: Flink CEP Incremental Checkpoint Issue Hi, Given

Re: ExecutionMode in ExecutionConfig

2022-09-16 Thread Yun Tang
va/org/apache/flink/streaming/examples/wordcount/WordCount.java#L98 Best Yun Tang From: zhanghao.c...@outlook.com Sent: Thursday, September 15, 2022 0:03 To: Hailu, Andreas ; user@flink.apache.org Subject: Re: ExecutionMode in ExecutionConfig It's added in Flin

Re: [FEEDBACK] Metadata Platforms / Catalogs / Lineage integration

2022-09-12 Thread Yun Tang
acks rich information on >FlinkJobListener just as Feng mentioned, which has been supported well by >Spark, to send data lineage to external systems. [1] https://feature-requests.datahubproject.io/p/flink-integration Best Yun Tang From: wangqinghuan <

Re: 怀疑源码中的一个方法是never reached code

2022-06-14 Thread Yun Tang
Hi,育锋 我觉得你的分析应该是没问题的。可以创建一个ticket来修复该问题。另外,关于代码实现的具体讨论,建议在dev邮件列表讨论。 祝好 唐云 From: 朱育锋 Sent: Tuesday, June 14, 2022 19:33 To: user-zh@flink.apache.org Subject: 怀疑源码中的一个方法是never reached code Hello Everyone

Re: NegativeArraySizeException trying to take a savepoint

2022-06-14 Thread Yun Tang
/iterator.cc#L239-L245 Best Yun Tang From: Martijn Visser Sent: Monday, June 13, 2022 21:47 To: Mike Barborak Cc: user@flink.apache.org Subject: Re: NegativeArraySizeException trying to take a savepoint Hi Mike, It would be worthwhile to check if this still occurs

Re: 1.13.5版本sql大小64k限制bug

2022-05-25 Thread Yun Tang
Hi 请使用英文在dev社区发送邮件。另外关于使用方面的问题,建议向user-zh 频道发送,已经帮你转发到相关邮件列表了。 祝好 唐云 From: Lose control ./ <286296...@qq.com.INVALID> Sent: Tuesday, May 24, 2022 9:15 To: dev Subject: 1.13.5版本sql大小64k限制bug 请问各位大神,1.13.5版本sql大小64k限制如何修改啊?谢谢

Re: RocksDB's state size discrepancy with what's seen with state processor API

2022-05-18 Thread Yun Tang
/frocksdb/blob/8608d75d85f8e1b3b64b73a4fb6d19baec61ba5c/java/src/main/java/org/rocksdb/DBOptions.java#L520 Best Yun Tang From: Alexis Sarda-Espinosa Sent: Tuesday, May 3, 2022 8:47 To: Peter Brucia Cc: user@flink.apache.org Subject: RE: RocksDB's state size discrepancy

Re: [Discuss] Creating an Apache Flink slack workspace

2022-05-11 Thread Yun Tang
to set up our own slack workspace. Best Yun Tang From: Jingsong Li Sent: Thursday, May 12, 2022 10:49 To: Xintong Song Cc: dev ; user Subject: Re: [Discuss] Creating an Apache Flink slack workspace Hi all, Regarding using ASF slack. I share the problems I saw

Re: Failed to restore from ck, because of KryoException

2022-05-10 Thread Yun Tang
the compatibility problem, we also suggest you to use customized serializers for your customized class for better performance. Best Yun Tang From: Liting Liu (litiliu) Sent: Friday, May 6, 2022 10:20 To: user@flink.apache.org Subject: Failed to restore from ck, because

Re: RocksDB efficiency and keyby

2022-04-21 Thread Yun Tang
ttps://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/#state-backend-rocksdb-memory-partitioned-index-filters Best Yun Tang From: Yaroslav Tkachenko Sent: Thursday, April 21, 2022 0:44 To: Trystan Cc: user Subject: Re: RocksDB effici

Re: RocksDB state not cleaned up

2022-04-08 Thread Yun Tang
this is really useful for Flink as this could push data to the last level, which leads to increase the read amplification. [1] https://javadoc.io/doc/org.rocksdb/rocksdbjni/6.20.3/org/rocksdb/RocksDB.html Best Yun Tang From: Alexis Sarda-Espinosa Sent: Friday, April 8

Re: flink pipeline handles very small amount of messages in a second (only 500)

2022-04-07 Thread Yun Tang
whether the sink back-pressured the source to impact the throughput of source. Last but not least, did your source already have 100% CPU usage, which means your source operator has already reached to its highest throughput. Best Yun Tang From: Sigalit Eliazov

Re: RocksDB 读 cpu 100% 如何调优

2022-03-21 Thread Yun Tang
Hi, RocksDB 的CPU栈能卡在100%,很有可能是大量解压缩 index/filter block导致的,可以enable partition index/filter [1] 看看问题是否解决。 相关内容也可以参考我之前线下做过的分享 [2] [1] https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#state-backend-rocksdb-memory-partitioned-index-filters [2]

Re: RocksDB metrics for effective memory consumption

2022-03-16 Thread Yun Tang
[1]. Please note that all RocksDB instances within same slot would share the same block cache, they will report same usage. [1] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/#state-backend-rocksdb-metrics-block-cache-usage Best Yun Tang

Re: Re:flink s3 checkpoint 一直IN_PROGRESS(100%)直到失败

2022-03-08 Thread Yun Tang
Hi 一般是卡在最后一步从JM写checkpoint meta上面了,建议使用jstack等工具检查一下JM的cpu栈,看问题出在哪里。 祝好 唐云 From: Sun.Zhu <17626017...@163.com> Sent: Tuesday, March 8, 2022 14:12 To: user-zh@flink.apache.org Subject: Re:flink s3 checkpoint 一直IN_PROGRESS(100%)直到失败 图挂了

Re: Incremental checkpointing & RocksDB Serialization

2022-03-06 Thread Yun Tang
-docs-release-1.13/docs/ops/monitoring/back_pressure/ [2] https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/dev/datastream/fault-tolerance/custom_serialization/ Best, Yun Tang From: Vidya Sagar Mula Sent: Sunday, March 6, 2022 4:16 To: Yun Tang Cc

Re: Incremental checkpointing & RocksDB Serialization

2022-03-04 Thread Yun Tang
o vs > Avro) >From our experience, kryo is not a good choice in most cases. Best Yun Tang From: Vidya Sagar Mula Sent: Friday, March 4, 2022 17:00 To: user Subject: Incremental checkpointing & RocksDB Serialization Hi, I have a cluster that c

Re: Pods are OOMKilled with RocksDB backend after a few checkpoints

2022-02-28 Thread Yun Tang
/flink/flink-docs-release-1.14/docs/deployment/config/#state-backend-rocksdb-metrics-block-cache-usage Best Yun Tang From: Alexandre Montecucco Sent: Friday, February 25, 2022 20:14 To: user Subject: Pods are OOMKilled with RocksDB backend after a few checkpoints

Re: 状态初始化

2022-02-27 Thread Yun Tang
Hi, 这个需求在社区里面称之为 state bootstrapping, 以前在state processor API没有引入时,还有第三方的工具 bravo [1]。 我理解你的需求完全可以有state processor API完成,生成一个savepoint,由新作业消费。目前社区也在考虑支持生成native savepoint,用以加快生成速度 [2] [1] https://github.com/king/bravo [2] https://issues.apache.org/jira/browse/FLINK-25528 Best Yun Tang

Re: How to prevent check pointing of timers ?

2022-02-07 Thread Yun Tang
Hi Alex, I think the better solution is to know what the problem you have ever met when restoring the timers? Flink does not support to remove state (including timer state) currently. Best Yun Tang From: Alex Drobinsky Sent: Monday, February 7, 2022 21:09

Re: Inaccurate checkpoint trigger time

2022-01-30 Thread Yun Tang
https://github.com/apache/flink/blob/90e850301e672fc0da293abc55eb446f7ec68ffa/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java#L540 [2] https://issues.apache.org/jira/browse/FLINK-17073 Best Yun Tang From: Robert Metzger

Re: Reading performance - Kafka VS FileSystem

2022-01-26 Thread Yun Tang
Hi Jasmin, >From my knowledge, it seems no big company would adopt pure file system source >as the main data source of Flink. We would in general choose a message queue, >e.g Kafka, as the data source. Best Yun Tang From: Jasmin Redžepović Sent:

Re: Failure Restart Strategy leads to error

2022-01-26 Thread Yun Tang
Hi Siddhesh, The root cause is that the configuration of group.id is missing for the Flink program. The configuration of restart strategy has no relationship with this. I think you should pay your attention to kafka related configurations. Best Yun Tang From

Re: Is Scala the best language for Flink?

2022-01-24 Thread Yun Tang
- ASF JIRA<https://issues.apache.org/jira/browse/FLINK-14105> As the consensus among our community(please link dedicated thread if there is) we keep in mind that flink-runtime will be eventually scala-free. It is because of ... issues.apache.org  Best Yun Tang ___

Re: Question about MapState size

2022-01-23 Thread Yun Tang
-sst-files-size[1] to know the total sst files on disks of each rocksDB. [1] https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/deployment/config/#state-backend-rocksdb-metrics-total-sst-files-size Best Yun Tang From: Abdul Rahman Sent: Saturday,

Re: Apache Flink - Can AllWindowedStream be parallel ?

2022-01-23 Thread Yun Tang
Hi Singh, All the output operator transformed by AllWindowedStream would be SingleOutputStreamOperator, which cannot be parallel. [1] https://github.com/apache/flink/blob/master/flink-streaming-scala/src/main/scala/org/apache/flink/streaming/api/scala/AllWindowedStream.scala Best Yun Tang

Re: [DISCUSS] Deprecate MapR FS

2022-01-09 Thread Yun Tang
+1 for dropping the MapR Fs. Best Yun Tang From: Till Rohrmann Sent: Wednesday, January 5, 2022 18:33 To: Martijn Visser Cc: David Morávek ; dev ; Seth Wiesman ; User Subject: Re: [DISCUSS] Deprecate MapR FS +1 for dropping the MapR FS. Cheers, Till On Wed

Re: Operator state in New Source API

2021-12-22 Thread Yun Tang
/#using-operator-state Best, Yun Tang From: Krzysztof Chmielewski Sent: Thursday, December 23, 2021 6:32 To: user Subject: Operator state in New Source API Hi, Is it possible to use managed operator state like MapState in an implementation of new unified source

Re: Re: flink sql支持细粒度的状态配置

2021-12-09 Thread Yun Tang
Hi, 如果你们可以自己实现一套SQL语句到jobgraph的预编译转换IDE,然后在IDE中可以手动配置jobgraph每个算子的配置,应该是可以达到你们的目的 (可能还需要结合细粒度调度模式)。 祝好 唐云 From: gygz...@163.com Sent: Thursday, December 9, 2021 16:14 To: user-zh Subject: 回复: Re: flink sql支持细粒度的状态配置 Hi Yun Tang 感谢你的回复,我们在调研的过程中也发现,正如你所说的生成的

Re: flink sql支持细粒度的状态配置

2021-12-08 Thread Yun Tang
Hi 你好, 我认为这是一个很好的需求,对于data stream以及python API来说,state TTL都是通过API逐个配置的,你的需求就可以直接满足。但是对于SQL来说,由于相同的SQL语句,不同优化器其生成的执行plan可能会差异很大,很难对某个operator内的state进行TTL进行配置,可能一种方式是增加一些SQL的优化hint,对于你示例中的join语句和groupBy 的count语句配以不同的TTL,但是目前Flink SQL尚未支持该功能。 祝好 唐云 From:

Re: [ANNOUNCE] Open source of remote shuffle project for Flink batch data processing

2021-11-30 Thread Yun Tang
Great news! Thanks for all the guys who contributed in this project. Best Yun Tang On 2021/11/30 16:30:52 Till Rohrmann wrote: > Great news, Yingjie. Thanks a lot for sharing this information with the > community and kudos to all the contributors of the external shuffle service > :-) &g

Re: Will Flink loss some old Keyed State when changing the parallelism

2021-11-26 Thread Yun Tang
Hi Yang, Flink keeps the max key groups the same no matter how parallelism changes, and use this to avoid state data lost [1] [1] https://flink.apache.org/features/2017/07/04/flink-rescalable-state.html Best Yun Tang On 2021/11/26 10:07:29 Nicolaus Weidner wrote: > Hi, > > to res

Re: 检查点和保存点

2021-11-12 Thread Yun Tang
Hi checkpoint 以及 savepoint是否可以生效取决于相关source的实现,Kafka这种是支持replay非常好的source,至于file reader,目前 split file reader [1] 相关的实现是支持 容错的 [1] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/datastream/sources/#the-split-reader-api 祝好 唐云 From: lei-tian

Re: MongoDB sink

2021-11-10 Thread Yun Tang
Hi, 具体问题建议直接在相关ticket上进行讨论,邮件列表上可能相关人士没有注意到。 祝好 唐云 From: 不许人间见白头 Sent: Wednesday, November 10, 2021 22:28 To: user-zh Subject: MongoDB sink 你好, 请问一下,关于New feature: FLINK-24477 预计什么时候创建PR呢?

Re: savepoint.readKeyedState hangs on org.rocksdb.RocksDB.disposeInternal

2021-10-31 Thread Yun Tang
each time. [1] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/#state-backend-rocksdb-log-dir [2] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/#state-backend-rocksdb-log-level Best Yun Tang

Re: flink 以阿里云 oss 作为 checkpoint cpu 过高

2021-10-29 Thread Yun Tang
Hi 可以使用jstack,async profiler [1] 等工具勘察一下checkpoint期间的CPU栈。oss需要先写本地再上传,确实可能CPU消耗多一些,但是明显高很多有一些超出预期。 [1] https://github.com/jvm-profiling-tools/async-profiler 祝好 唐云 From: Lei Wang Sent: Tuesday, October 19, 2021 14:01 To: user-zh@flink.apache.org Subject: Re:

Re: 关于作业失败从checkpoint重启,触发了过期的窗口计算

2021-10-29 Thread Yun Tang
Hi, 先问个版本问题,你的Flink版本是1.3 而不是1.13? Checkpoint里面会存储timer,所以重启之后会触发窗口的计算,但确实这种一天的窗口累计有点太多了,除非你的作业存在比较严重的反压,导致checkpoint内积攒了大量没有触发的timer。 祝好 唐云 From: claylin <1012539...@qq.com.INVALID> Sent: Friday, October 29, 2021 11:33 To: user-zh Subject:

Re: [ANNOUNCE] Apache Flink 1.13.3 released

2021-10-22 Thread Yun Tang
Thanks for Chesnay & Martijn and everyone who made this release happen. Best Yun Tang From: JING ZHANG Sent: Friday, October 22, 2021 10:17 To: dev Cc: Martijn Visser ; Jingsong Li ; Chesnay Schepler ; user Subject: Re: [ANNOUNCE] Apache Flink 1.13.3 rele

Re: Reset of transient variables in state to default values.

2021-10-20 Thread Yun Tang
, it will pick the registered kryo serializer only for checkpoint/restore. Since java-based state backend would not deep copy key-values for performance reasons, it might be changed unexpectedly if user misused, which might make the field reset to default value. Best, Yun Tang

Re: Reset of transient variables in state to default values.

2021-10-12 Thread Yun Tang
could read the transient field back if using FileSystemStateBackend. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/serialization/types_serialization/#flinks-typeinformation-class Best Yun Tang From: Alex Drobinsky Sent: Monday, October

Re: Checkpoint size increasing even i enable increasemental checkpoint

2021-10-12 Thread Yun Tang
mechanism to remove useless files, which is triggered by internal compaction. You should not care too much on the checkpointed data size as your job consuming more and more records, moreover the increasing size is actually quite small (from 1.32GB to 1.34GB). Best Yun Tang

Re: Re:re:Re: 回复:Flink SQL官方何时能支持redis和mongodb连接器?

2021-09-23 Thread Yun Tang
t; 写道: >我司基于最新提供流批一体的接口,实现了mongodb的连接器,支持source和sink,实现了控制批量插入频率、控制缓存批的数据量和mongo文档对象和java对象转换,同时还可选择批量更新,并且使用的mongo最新异步驱动,后期我还会不断优化性能,看大佬能否推动一下,把这个连接器贡献给社区 > > >-- 原始邮件 ------ >发件人: "Yun Tang"; >发件时间: 2021-09-22 10:55 >收件人: "user-z

Re: 回复:Flink SQL官方何时能支持redis和mongodb连接器?

2021-09-21 Thread Yun Tang
Hi, 其实目前Flink社区并不是那么欢迎新增官方支持的connector,主要原因就是社区的开发人员有限,没有精力维护太多的connector,尤其是一些connector的实现需要一定的相关背景,但很难保证review代码的开发人员具有相关背景,毕竟大家都需要为自己approve的代码负责。 你可以在 flink-packages [1] 里面找一下,或者考虑自己实现并维护(基础实现应该是复杂度不高的)。 [1] https://flink-packages.org/ 祝好 唐云 From: 黑色

Re: Cleaning old incremental checkpoint files

2021-09-17 Thread Yun Tang
/src/main/java/org/apache/flink/runtime/checkpoint/Checkpoints.java#L99 [2] https://issues.apache.org/jira/browse/FLINK-24149 Best Yun Tang From: Robin Cassan Sent: Tuesday, September 7, 2021 20:17 To: Yun Tang Cc: Robert Metzger ; user Subject: Re: Cleaning old

Re: State processor API very slow reading a keyed state with RocksDB

2021-09-09 Thread Yun Tang
read option which disable fillCache [2] to speedup bulk scan in the future to help improve the performance. Best Yun Tang [1] https://github.com/jvm-profiling-tools/async-profiler [2] https://javadoc.io/doc/org.rocksdb/rocksdbjni/6.20.3/org/rocksdb/ReadOptions.html#setFillCache(boolean

Re: Cleaning old incremental checkpoint files

2021-09-03 Thread Yun Tang
solution to provide the ability to re-upload all files under some specific configured option so that we could let new job get decoupled with older checkpoints. Do you think that could resolve your case? Best Yun Tang From: Robin Cassan Sent: Wednesday, September

Re: flink oss ha

2021-08-30 Thread Yun Tang
ir=oss://bucket-logcenter/flink-state/flink-session-recovery \ -Dcontainerized.master.env.ENABLE_BUILT_IN_PLUGINS=flink-oss-fs-hadoop-1.13.2.jar \ -Dcontainerized.taskmanager.env.ENABLE_BUILT_IN_PLUGINS=flink-oss-fs-hadoop-1.13.2.jar -----邮件原件- 发件人: Yun Tang 发送时间: 2021年8月30日 11:36 收件人:

Re: flink oss ha

2021-08-29 Thread Yun Tang
Hi, 你好,图片无法加载,可以直接粘贴文字出来 祝好 唐云 From: dker eandei Sent: Friday, August 27, 2021 14:58 To: user-zh@flink.apache.org Subject: flink oss ha 您好: 看文档OSS可以用作 FsStatebackend,那么Flink on k8s 做高可用时,high-availability.storageDir可以配置成oss吗,我试了下,报以下错误:

Re: table.exec.state.ttl

2021-08-29 Thread Yun Tang
Hi 航飞 可以参照[1] 看是不是类似的问题 [1] https://issues.apache.org/jira/browse/FLINK-23721 祝好 唐云 From: 李航飞 Sent: Thursday, August 26, 2021 19:02 To: user-zh Subject: table.exec.state.ttl Configuration conf = new Configuration();

Re: JobManager Resident memory Always Increasing

2021-08-16 Thread Yun Tang
. [1] https://shipilev.net/jvm/anatomy-quarks/12-native-memory-tracking/ [2] https://gist.github.com/thomasdarimont/79b3cef01e5786210309 Best Yun Tang From: Pranjul Ahuja Sent: Monday, August 16, 2021 13:10 To: user@flink.apache.org Subject: JobManager Resident

Re: Inspecting SST state of rocksdb

2021-08-09 Thread Yun Tang
names. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/libs/state_processor_api/#state-processor-api [2] https://github.com/apache/flink/blob/0d2b945729df8f0a0149d02ca24633ae52a1ef21/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/Checkpoints.java#L99 Best Yun

[ANNOUNCE] Apache Flink 1.13.2 released

2021-08-05 Thread Yun Tang
are available in Jira: https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12350218==12315522 We would like to thank all contributors of the Apache Flink community who made this release possible! Regards, Yun Tang

[ANNOUNCE] Apache Flink 1.13.2 released

2021-08-05 Thread Yun Tang
are available in Jira: https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12350218==12315522 We would like to thank all contributors of the Apache Flink community who made this release possible! Regards, Yun Tang

Re: Best Practice of Using HashSet State

2021-08-05 Thread Yun Tang
. Best Yun Tang From: Jerome Li Sent: Friday, August 6, 2021 7:57 To: user@flink.apache.org Subject: Best Practice of Using HashSet State Hi, I am new to Flink and state backend. I find Flink does provide ValueState, ListState, and MapState. But it does

Re: Bloom Filter - RocksDB - LinkageError Classloading

2021-08-05 Thread Yun Tang
Hi Sandeep, If you set the flink-statebackend-rocksdb as provided scope, it should not include the org.rocksdb classes, have you ever checked your application jar package directly just as what I described? Best Yun Tang From: Sandeep khanzode Sent: Friday

Re: 1.14啥时候出呀

2021-08-05 Thread Yun Tang
Hi Flink-1.13.2 的jar包正在同步到给个maven仓库,顺利的话,明天就可以正式announce了。 祝好 唐云 From: Jingsong Li Sent: Wednesday, August 4, 2021 16:56 To: user-zh Subject: Re: 1.14啥时候出呀 1.14还有1-2个月 1.13.2马上就出了,估计明天或后天或周一 On Wed, Aug 4, 2021 at 4:48 PM yidan zhao wrote: >

Re: Bloom Filter - RocksDB - LinkageError Classloading

2021-08-05 Thread Yun Tang
has different flink version with servers. Best, Yun Tang From: Stephan Ewen Sent: Wednesday, August 4, 2021 19:10 To: Yun Tang Cc: Sandeep khanzode ; user Subject: Re: Bloom Filter - RocksDB - LinkageError Classloading @Yun Tang Does it make sense to add RocksDB

Re: [ANNOUNCE] RocksDB Version Upgrade and Performance

2021-08-05 Thread Yun Tang
] https://github.com/facebook/rocksdb/tree/master/java/jmh Best, Yun Tang From: Piotr Nowojski Sent: Thursday, August 5, 2021 2:01 To: Yuval Itzchakov Cc: Yun Tang ; Nico Kruber ; user@flink.apache.org ; dev Subject: Re: [ANNOUNCE] RocksDB Version Upgrade

Re: [ANNOUNCE] RocksDB Version Upgrade and Performance

2021-08-04 Thread Yun Tang
/frocksdb/pull/19 [13] https://github.com/facebook/rocksdb/pull/5441/ [14] https://github.com/facebook/rocksdb/pull/2283 Best, Yun Tang On Wed, Aug 4, 2021 at 2:36 PM Yuval Itzchakov mailto:yuva...@gmail.com>> wrote: We are heavy users of RocksDB and have had several issues with memory m

Re: Bloom Filter - RocksDB - LinkageError Classloading

2021-08-04 Thread Yun Tang
added the dependency of org.rocksdb:rocksdbjni in your pom). Best Yun Tang From: Sandeep khanzode Sent: Wednesday, August 4, 2021 11:54 To: user Subject: Bloom Filter - RocksDB - LinkageError Classloading Hello, I tried to add the bloom filter functionality

Re: Flink k8 HA mode + checkpoint management

2021-08-03 Thread Yun Tang
/CheckpointCoordinator.java#L1226 Best Yun Tang From: Manong Karl Sent: Wednesday, August 4, 2021 9:17 To: Harsh Shah Cc: user@flink.apache.org Subject: Re: Flink k8 HA mode + checkpoint management Can You please share your configs? I'm using native kubernetes without HA

Re: as-variable configuration for state ac

2021-07-27 Thread Yun Tang
Hi Mason, I think this request is reasonable and you could create a JIRA ticket so that we could resolve it later. Best, Yun Tang From: Mason Chen Sent: Tuesday, July 27, 2021 15:15 To: Yun Tang Cc: Mason Chen ; user@flink.apache.org Subject: Re

Re: as-variable configuration for state ac

2021-07-26 Thread Yun Tang
Hi Mason, In rocksDB, one state is corresponding to a column family and we could aggregate all RocksDB native metrics per column family. If my understanding is right, are you hoping that all state latency metrics for a particular state could be aggregated per state level? Best Yun Tang

Re: flink大窗口性能问题

2021-07-16 Thread Yun Tang
目前Flink社区版RocksDB尚不支持ARM架构机器。使用RocksDB的话,内存均是堆外管理,与JVM的堆上内存无关。 另外,有个题外话,你们是云上产品还是自建了ARM集群?有点好奇目前国内的ARM集群使用率情况。 祝好 唐云 From: Wanghui (HiCampus) Sent: Thursday, July 15, 2021 11:33 To: user-zh@flink.apache.org Subject: Re: flink大窗口性能问题 我在aarch64 + jre

Re: flink 触发保存点失败

2021-07-13 Thread Yun Tang
Hi, 这个看上去是client触发savepoint失败,而不是savepoint本身end-to-end执行超时。建议对照一下JobManager的日志,观察在触发的时刻,JM日志里是否有触发savepoint的相关日志,也可以在flink web UI上观察相应的savepoint是否出现在checkpoint tab的历史里面。 祝好 唐云 From: 仙剑……情动人间 <1510603...@qq.com.INVALID> Sent: Tuesday, July 13, 2021 17:31 To:

Re: local运行模式下不会生成checkpoint吗?

2021-07-09 Thread Yun Tang
Hi 只要enable了checkpoint,一定会生成checkpoint的,这与你的运行模式无关。可以检查一下日志,看看JM端是否正常触发了checkpoint 祝好 唐云 From: casel.chen Sent: Tuesday, June 29, 2021 9:55 To: user-zh@flink.apache.org Subject: local运行模式下不会生成checkpoint吗? 我在本地使用local运行模式运行flink sql,将数据从kafka写到mongodb,mongodb

Re: Flink 1.10 内存问题

2021-07-06 Thread Yun Tang
Hi, 有可能的,如果下游发生了死锁,无法消费任何数据的话,整个作业就假死了。要定位root cause还是需要查一下下游的task究竟怎么了 祝好 唐云 From: Ada Luna Sent: Tuesday, July 6, 2021 12:04 To: user-zh@flink.apache.org Subject: Re: Flink 1.10 内存问题 反压会导致整个Flink任务假死吗?一条Kafka数据都不消费了。持续几天,不重启不恢复的 Yun Tang 于2021年7月6日周二 上午11

Re: Flink 1.10 内存问题

2021-07-05 Thread Yun Tang
Hi, LocalBufferPool.requestMemorySegment 这个方法并不是在申请内存,而是因为作业存在反压,因为下游没有及时消费,相关buffer被占用,所以上游会卡在requestMemorySegment上面。 想要解决还是查一下为什么下游会反压。 祝好 唐云 From: Ada Luna Sent: Tuesday, July 6, 2021 10:43 To: user-zh@flink.apache.org Subject: Re: Flink 1.10 内存问题

Re: Exception in snapshotState suppresses subsequent checkpoints

2021-07-01 Thread Yun Tang
://github.com/apache/flink/commit/b332ce40d88be84d9cf896f446c7c6e26dbc8b6a#diff-54e9ce3b15d6badcc9376ab144df066eb46c4e516d6ee31ef8eb38e2d4359042 Best Yun Tang From: Matthias Pohl Sent: Thursday, July 1, 2021 16:41 To: tao xiao Cc: Yun Tang ; user ; Roman Khachatryan Subject

Re: Exception in snapshotState suppresses subsequent checkpoints

2021-06-28 Thread Yun Tang
containing 'FromElementsFunctionT' has ever been completed. Best Yun Tang From: tao xiao Sent: Saturday, June 26, 2021 16:40 To: user Subject: Re: Exception in snapshotState suppresses subsequent checkpoints Btw here is the checkpoint related log [2021-06-26 16:08

Re: Metric for JVM Overhaed

2021-06-25 Thread Yun Tang
JVM. Some tools provided by memory allocator such jemalloc or tcmalloc, could help find how much the memory usage via OS malloc. Even though, there still exists some memory used by mmap or on local stack, which is not so easy to detect. Best Yun Tang From: Guowei

Re: High Flink checkpoint Size

2021-06-23 Thread Yun Tang
-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L158-L159 Best, Yun Tang From: Vijayendra Yadav Sent: Wednesday, June 23, 2021 11:02 To: user Subject: High Flink checkpoint Size Hi Team, I

Re: PoJo to Avro Serialization throw KryoException: java.lang.UnsupportedOperationException

2021-06-23 Thread Yun Tang
/flink/api/common/typeinfo/TypeInformation.java#L208 Best Yun Tang From: Rommel Holmes Sent: Wednesday, June 23, 2021 13:43 To: user Subject: PoJo to Avro Serialization throw KryoException: java.lang.UnsupportedOperationException My Unit test was running OK under

Re: 中文教程更新不及时问题

2021-06-22 Thread Yun Tang
Hi Kevin, 欢迎来到Apache Flink开源社区! 因为开源社区的工作,一些参与者很多时候都是工作时间之外参与的,可能难免遇到进度更新不及时,或者长时间不再活跃的问题。 非常欢迎您在相关JIRA ticket下面评论和申请权限创建PR,社区一直都欢迎每一位贡献者,对于文档的维护尤其是中文文档的翻译也是非常需要的,如果有任何想要贡献的部分,欢迎直接去JIRA ticket下面、github PR下面评论,或者直接创建相关ticket。 祝好 唐云 From: pang fan Sent:

Re: rocksdb对比filestatebackend

2021-06-22 Thread Yun Tang
Hi Yidan, 1. 是否我从FileStateBackend切换到RocksDB其实性能也不会降低很多呢? 2. 这个涉及到RocksDB这种LSM架构的DB读写路径了,即使逻辑数据量可以全部存储在内存中,由于RocksDB的write buffer默认会存储相同key的不同value,而且checkpoint时候仍然会触发flush,很难避免数据落盘,数据落盘之后的读路径肯定没有Flink的内存state backend性能好,二者性能还是有些差异的,不过实际生产中可能不需要 FsStateBackend 那么高的性能。 1.

Re: How to set state.backend.rocksdb.latency-track-enabled

2021-06-18 Thread Yun Tang
Hi Chen-Che, The PR-16177 [1] is the documentation for state access latency tracking, thought it has not been merged, you could still refer it for more details. [1] https://github.com/apache/flink/pull/16177 Best Yun Tang From: Chen-Che Huang Sent: Friday

Re: Flink state evolution with avro

2021-06-17 Thread Yun Tang
Hi, 你可以参照社区的 state-evolution的 E2E 测试代码 [1], 整个程序就是使用的avro作为相关类的声明工具。 [1] https://github.com/apache/flink/tree/master/flink-end-to-end-tests/flink-state-evolution-test/src/main 祝好 唐云 From: casel.chen Sent: Friday, June 11, 2021 8:13 To:

Re: RocksDB CPU resource usage

2021-06-17 Thread Yun Tang
thread stack. [1] https://github.com/jvm-profiling-tools/async-profiler Best Yun Tang From: Robert Metzger Sent: Thursday, June 17, 2021 14:11 To: Padarn Wilson Cc: JING ZHANG ; user Subject: Re: RocksDB CPU resource usage If you are able to execute your job loc

Re: Discard checkpoint files through a single recursive call

2021-06-15 Thread Yun Tang
Hi Jiang, Please take a look at FLINK-17860 and FLINK-13856 for previous discussion of this problem. [1] https://issues.apache.org/jira/browse/FLINK-17860 [2] https://issues.apache.org/jira/browse/FLINK-13856 Best Yun Tang From: Guowei Ma Sent: Wednesday

Re: Question about State TTL and Interval Join

2021-06-06 Thread Yun Tang
state size will be. Moreover, the OOM should not be related to RocksDB as it used off-heap native memory, and you might need some work to dig what occupied the JVM memory during checkpoints. Best Yun Tang From: McBride, Chris Sent: Saturday, June 5, 2021 3:17

Re: 关于flink sql的kafka source的开始消费offset相关问题。

2021-06-06 Thread Yun Tang
hi, 本质上来说,你的做法有点hack其实不推荐,如果非要这么做的话,你还可以通过 numRestarts [1] 的指标来看重启了多少次。 [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/metrics/#availability 祝好 唐云 From: yidan zhao Sent: Friday, June 4, 2021 11:52 To: user-zh Subject: Re: 关于flink

Re: Flink Sql 的/checkpoint/shared/文件夹大小不断增长,源数据没有数据激增,应该如何控制?

2021-06-02 Thread Yun Tang
Hi, 没有被引用的文件可能也不是完全无用的,可能是当前pending checkpoint正在上传的,所以还需要比较一下那些不在checkpoint meta内的文件的修改时间戳,可能比你分析的complete checkpoint的时间戳要大。 总体上来说,我不认为这个问题是一个bug,这个是LSM架构的DB的空间放大问题。如果你对空间放大非常担心的话,可以启用 dynamic level [1] 来严格控制空间放大,不过这个可能会影响写放大和读放大,导致性能受到一定影响。 [1]

Re: Flink Sql 的/checkpoint/shared/文件夹大小不断增长,源数据没有数据激增,应该如何控制?

2021-06-01 Thread Yun Tang
Hi, 增量checkpoint上传的是sst文件本身,里面可能有一部分空间是被无用数据占据的,你可以理解成增量checkpoint上传的是受到空间放大影响的RocksDB的数据,如果因为单机的数据量较小,没有及时触发compaction的话,确实存在整个远程checkpoint目录数据大于当前实际空间的情况。而关闭增量checkpoint,上传的其实是与savepoint格式一样的kv数据对,Flink会遍历整个DB,将目前有效的数据写出到远程。所以你关闭增量checkpoint,而发现checkpoint目录保持恒定大小的话,说明真实有效数据的空间是稳定的。

Re: Flink Sql 的/checkpoint/shared/文件夹大小不断增长,源数据没有数据激增,应该如何控制?

2021-05-31 Thread Yun Tang
Hi, 先确定一下,你的 idleStateRetention 是 3600秒?其次,要想看是否所有数据均有用,可以利用 Checkpoints.loadCheckpointMeta [1] 去加载你所保留的checkpoint目录下的 _metadata 文件,然后与当前checkpoint目录下的文件作对比,看是否存在大量的未删除旧文件。 目前仅凭你的描述和一段SQL代码其实很难判断。 可能存在的原因有: 1. 单次checkpoint文件数目过多,JM单点删除跟不上相关速度 2. 整体checkpoint

Re: Dynamic configuration of Flink checkpoint interval

2021-05-30 Thread Yun Tang
-alignment-timeout Best Yun Tang From: Senhong Liu Sent: Monday, May 31, 2021 10:33 To: JING ZHANG Cc: Kai Fu ; user Subject: Re: Dynamic configuration of Flink checkpoint interval Hi all, In fact, a pretty similar JIRA has been created, which is https

Re: Error restarting job from Savepoint

2021-05-30 Thread Yun Tang
customized serializer for those classes. Another better solution is to ensure the class backwards compatibility with customized serializer or leverage apache avro. You could refer to [1] for more details. [1] https://flink.apache.org/news/2020/04/15/flink-serialization-tuning-vol-1.html Best Yun

Re: Flink upgraded to version 1.12.0 and started from SavePoint to report an error

2021-05-19 Thread Yun Tang
Hi BaseRowSerializer 已经在Flink-1.11 时候改名成 RowDataSerializer了,即使用 state-processor-API 也没办法处理当前不存在的类,可能有种比较复杂的办法是自己把 BaseRowSerializer 的类不改变package的情况下拷贝出来,然后用 state-processor-API 将相关类强制转换成 RowDataSerializer,不过考虑到你的job graph都是SQL生成的,state-processor-API面向地更多的是data stream

Re: [ANNOUNCE] Apache Flink 1.13.0 released

2021-05-06 Thread Yun Tang
Thanks for Dawid and Guowei's great work, and thanks for everyone involved for this release. Best Yun Tang From: Xintong Song Sent: Thursday, May 6, 2021 12:08 To: user ; dev Subject: Re: [ANNOUNCE] Apache Flink 1.13.0 released Thanks Dawid & Gu

Re: 流跑着跑着,报出了rocksdb层面的错误 Error while retrieving data from RocksDB

2021-05-06 Thread Yun Tang
Hi, 你可以参阅文档 [1] : 由于 RocksDB 的 JNI API 构建在 byte[] 数据结构之上, 所以每个 key 和 value 最大支持 2^31 字节。 重要信息: RocksDB 合并操作的状态(例如:ListState)累积数据量大小可以超过 2^31 字节,但是会在下一次获取数据时失败。这是当前 RocksDB JNI 的限制。 [1]

Re: savepoint command in code

2021-05-05 Thread Yun Tang
/c688bf3c83e72155ccf5d04fe397b7c0a1274fd1/flink-tests/src/test/java/org/apache/flink/test/checkpointing/SavepointITCase.java#L438 Best Yun Tang From: Abdullah bin Omar Sent: Tuesday, May 4, 2021 11:50 To: user@flink.apache.org Subject: savepoint command in code Hello, I am trying

Re: Checkpoint error - "The job has failed"

2021-04-28 Thread Yun Tang
Hi Dan, You could refer to the "Fix Versions" in FLINK-16753 [1] and know that this bug is resolved after 1.11.3 not 1.11.1. [1] https://issues.apache.org/jira/browse/FLINK-16753 Best Yun Tang From: Dan Hill Sent: Tuesday, April 27, 2021 7:50 To: Yu

Re: Checkpoint error - "The job has failed"

2021-04-26 Thread Yun Tang
Hi Dan, I think you might use older version of Flink and this problem has been resolved by FLINK-16753 [1] after Flink-1.10.3. [1] https://issues.apache.org/jira/browse/FLINK-16753 Best Yun Tang From: Robert Metzger Sent: Monday, April 26, 2021 14:46 To: Dan

Re: 回复:Re: CheckpointedFunction#snapshotState访问键控状态报错

2021-04-13 Thread Yun Tang
able里面实际kv状态 ---原始邮件--- 发件人:Yun Tang

Re: CheckpointedFunction#snapshotState访问键控状态报错

2021-04-10 Thread Yun Tang
Hi snapshotState主要是给operator state用的,异常原因是keyed state 访问时需要设置currentKey的,但是currentKey是当前正在处理的record的key,与snapshotState的执行时候的语义不一样,执行snapshotState方法的时候,是可以没有当前record的。 如果想要访问整个keyed state,可以通过 KeyedStateBackend#getKeys(String state, N namespace) 来访问,但还是不建议将keyed

  1   2   3   4   5   6   >