This has not moved for a while, so I've assigned it to you.
G
On Mon, Jul 15, 2024 at 9:06 AM Zhongyou Lee
wrote:
> Hello everyone:
>
> Up to now, the only way to adjust the RocksDB flush threads is to implement
> ConfigurableRocksDBOptionsFactory#setMaxBackgroundFlushes yourself. I found
Hi Banupriya,
Sometimes an SST file will not be compacted and will be referenced for a long
time. That depends on how RocksDB picks the files for compaction. It may
happen when some range of keys is never touched after some point in time,
since RocksDB only takes care of the files or key range
Hi All,
I have a Flink job with an RMQ source, filters, a tumbling window (processing
time, fires every 2s), an aggregator, and an RMQ sink. Incremental RocksDB
checkpoints are enabled every 10s, with the minimum pause between checkpoints
set to 5s. My checkpoint size keeps increasing, so I am planning to tune
Hello everyone:
Up to now, the only way to adjust the RocksDB flush threads is to implement
ConfigurableRocksDBOptionsFactory#setMaxBackgroundFlushes yourself. I found
FLINK-22059, which addresses this problem. The PR was never finished, and I
want to finish it. Can anyone assign it to me?
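For readers of the archive, a user-side factory along the lines described might look roughly like this. This is a minimal sketch only: the class name, the thread count, and the hard-coded configuration are all illustrative, and setMaxBackgroundFlushes is deprecated in newer RocksDB releases in favor of setMaxBackgroundJobs.

```java
import java.util.Collection;

import org.apache.flink.configuration.ReadableConfig;
import org.apache.flink.contrib.streaming.state.ConfigurableRocksDBOptionsFactory;
import org.apache.flink.contrib.streaming.state.RocksDBOptionsFactory;
import org.rocksdb.ColumnFamilyOptions;
import org.rocksdb.DBOptions;

// Hypothetical factory that raises the number of RocksDB background flush threads.
public class FlushThreadsOptionsFactory implements ConfigurableRocksDBOptionsFactory {

    private int maxBackgroundFlushes = 2; // arbitrary default for this sketch

    @Override
    public RocksDBOptionsFactory configure(ReadableConfig configuration) {
        // A custom option key could be read from `configuration` here; hard-coded for brevity.
        return this;
    }

    @Override
    public DBOptions createDBOptions(DBOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
        return currentOptions.setMaxBackgroundFlushes(maxBackgroundFlushes);
    }

    @Override
    public ColumnFamilyOptions createColumnOptions(ColumnFamilyOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
        return currentOptions;
    }
}
```

Such a factory would then be registered via `state.backend.rocksdb.options-factory` in the Flink configuration.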
Hi,
> 1. After multiple full checkpoints and a NATIVE savepoint, the size was
> unchanged. I'm wondering if RocksDB compaction isn't reclaiming space because
> we never update key values? The state consists almost entirely of key space.
> Do keys not get freed by the RocksDB compaction filter for TTL
in
the map state which is monotonic per partition key. Map state was chosen
over list state in the hope that we can manage a sliding window using TTL.
Using RocksDB incremental checkpointing, the app runs very well despite the
large total checkpoint size. Our current checkpoint size is 3.2TB.
We have
Perhaps you can try following [1] to verify the loading problem.
BTW, it currently looks like some dependent libraries cannot be found. librocksdbjni-win64.dll
was built with VS2022 at the time, so you could also try installing VS2022 locally and retrying.
[1] https://github.com/facebook/rocksdb/issues/2531#issuecomment-313209314
On Tue, May 7, 2024 at 10:22, ha.fen...@aisino.com wrote:
>
> IntelliJ IDEA, Windows 10
> java.io.IOException: Could
What is the development environment? Windows?
Could you share a more detailed error, e.g. the .dll that cannot be found?
On Tue, May 7, 2024 at 09:34, ha.fen...@aisino.com wrote:
>
> Configuration config = new Configuration();
> config.set(StateBackendOptions.STATE_BACKEND, "rocksdb");
> config.set(CheckpointingOptions.CHECKPOINT_STORAGE, "
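For context, a runnable version of the kind of setup being quoted might look like this. This is a sketch only; the storage type and checkpoint path are placeholders, not the poster's actual (truncated) values.

```java
import org.apache.flink.configuration.CheckpointingOptions;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.StateBackendOptions;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RocksDBBackendSetup {
    public static void main(String[] args) {
        Configuration config = new Configuration();
        // Select the embedded RocksDB state backend.
        config.set(StateBackendOptions.STATE_BACKEND, "rocksdb");
        // Placeholder checkpoint storage settings -- not from the original mail.
        config.set(CheckpointingOptions.CHECKPOINT_STORAGE, "filesystem");
        config.set(CheckpointingOptions.CHECKPOINTS_DIRECTORY, "file:///tmp/checkpoints");

        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment(config);
        // ... build the job on `env` ...
    }
}
```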
Hi, Lei.
It's indeed a bit confusing. Could you share the related RocksDB log, which
may contain more detailed info?
On Fri, Apr 12, 2024 at 12:49 PM Lei Wang wrote:
>
> I enable RocksDB native metrics and do some performance tuning.
>
> state.backend.rocksdb.block.cache-size is s
I enabled RocksDB native metrics and am doing some performance tuning.
state.backend.rocksdb.block.cache-size is set to 128m, with 4 slots for each
TaskManager.
The observed result for one specific parallel slot:
state.backend.rocksdb.metrics.block-cache-capacity is about 14.5M
> *To:* Zhanghao Chen
> *Cc:* Biao Geng ; user
> *Subject:* Re: How to enable RocksDB native metrics?
>
> Hi Zhanghao,
>
> flink run -m yarn-cluster -ys 4 -ynm EventCleaning_wl -yjm 2G -ytm 16G
> -yqu default -p 8 -yDstate.backend.latency-track.keyed-state-ena
Adding a space between -yD and the param should do the trick.
Best,
Zhanghao Chen
From: Lei Wang
Sent: Thursday, April 11, 2024 19:40
To: Zhanghao Chen
Cc: Biao Geng ; user
Subject: Re: How to enable RocksDB native metrics?
Hi Zhanghao,
flink run -m yarn-cluster
Hi Lei,
You are using an old-styled CLI for YARN jobs where "-yD" instead of "-D"
should be used.
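Applied to the command quoted in this thread, the corrected invocation looks like this (the jar name is a placeholder; the other flags are taken from the original message):

```shell
flink run -m yarn-cluster -ys 4 -ynm EventCleaning_wl -yjm 2G -ytm 16G -yqu default -p 8 \
  -yD state.backend.latency-track.keyed-state-enabled=true \
  EventCleaning.jar
```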
From: Lei Wang
Sent: Thursday, April 11, 2024 12:39
To: Biao Geng
Cc: user
Subject: Re: How to enable RocksDB native metrics?
Hi Biao,
I
Best,
Biao Geng
On Mon, Apr 8, 2024 at 09:22, Marco Villalobos wrote:
> Hi Lei,
>
> Have you tried enabling these Flink configuration properties?
>
> Configuration
> <https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/deployment/config/#rocksdb-native-metrics>
Hi Lei,
You can enable it by some configurations listed in:
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/#rocksdb-native-metrics
(RocksDB Native Metrics)
Best,
Zakelly
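For readers of the archive, the options on that page are plain boolean keys in the Flink configuration. A small illustrative subset is shown below; check the linked page for the full list and your version's exact key names.

```yaml
state.backend.rocksdb.metrics.block-cache-capacity: true
state.backend.rocksdb.metrics.block-cache-usage: true
state.backend.rocksdb.metrics.estimate-num-keys: true
state.backend.rocksdb.metrics.cur-size-all-mem-tables: true
```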
You can take a look at the document. [
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/#rocksdb-native-metrics
]
Thanks,
Zbz
> On Apr 7, 2024, at 13:41, Lei Wang wrote:
>
>
> Using big state and want to do some performance tuning, how can i enable
>
I am using big state and want to do some performance tuning. How can I enable
RocksDB native metrics?
I am using Flink 1.14.4
Thanks,
Lei
ions. I know that Flink provides a set of window-based operators
> with time-based semantics and tumbling/sliding windows.
>
> By reading the Flink documentation, I understand that there is the
> possibility to change the memory backend utilized for storing the in-flight
> state of the o
Hi Gabriele,
The keyed state APIs (ValueState, ListState, etc.) are supported by all
types of state backend (hashmap, rocksdb, etc.), and the built-in window
operators are implemented with these state APIs internally. So you can use
these built-in operators/functions with the RocksDB state backend
Hi Gabriele,
Quick answer: You can use the built-in window operators which have been
integrated with state backends including RocksDB.
Thanks,
Zakelly
On Tue, Mar 5, 2024 at 10:33 AM Zhanghao Chen
wrote:
> Hi Gabriele,
>
> I'd recommend extending the existing window function
be satisfied with the
reduce/aggregate function pattern, which is important for large windows.
Best,
Zhanghao Chen
From: Gabriele Mencagli
Sent: Monday, March 4, 2024 19:38
To: user@flink.apache.org
Subject: Question about time-based operators with RocksDB backend
that there is the
possibility to change the memory backend utilized for storing the
in-flight state of the operators. For example, using RocksDB for this
purpose to cope with a larger-than-memory state. If I am not wrong, to
transparently change the backend (e.g., from in-memory to RocksDB) we
have to use
Hi Zakelly,
thanks for the information, that's interesting. Would you say that reading
a subset from RocksDB is fast enough to be pretty much negligible, or could
it be a bottleneck if the state of each key is "large"? Again assuming the
number of distinct partition keys is large
Hi Alexis,
Flink does need some heap memory to bridge requests to RocksDB and gather
the results. In most cases, that memory is discarded immediately (and eventually
collected by GC). In the case of timers, Flink does cache a limited subset of
key-values on the heap to improve performance.
In general you don't
Hello Alexis,
I don't think data in RocksDB resides in JVM even with function calls.
For more details, check the link below:
https://github.com/facebook/rocksdb/wiki/RocksDB-Overview#3-high-level-architecture
RocksDB has three main components - memtable, sstfile and WAL(not used in
Flink
Hi Asimansu
The memory RocksDB manages is outside the JVM, yes, but the mentioned
subsets must be bridged to the JVM somehow so that the data can be exposed
to the functions running inside Flink, no?
Regards,
Alexis.
On Thu, 15 Feb 2024, 14:06 Asimansu Bera, wrote:
> Hello Ale
Hello Alexis,
RocksDB resides off-heap and outside of JVM. The small subset of data ends
up on the off-heap in the memory.
For more details, check the following link:
https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/deployment/memory/mem_setup_tm/#managed-memory
I hope
Hello,
Most info regarding RocksDB memory for Flink focuses on what's needed
independently of the JVM (although the Flink process configures its limits
and so on). I'm wondering if there are additional special considerations
with regards to the JVM heap in the following scenario.
Assuming a key
On HDFS, the shard files are much larger than the chk-xxx files.
On 2024-01-18 14:49, "fufu" wrote:
It's a DataStream job. The window operator itself has no TTL configured, while the
other operators do. On the Flink UI I can see the window operator's size growing
continuously, by about 600-800 MB per day. For example, the checkpoint with ID 313
is almost 10 MB larger than ID 304, and it keeps growing like this as the job runs.
I am looking at the cp files and the RocksDB files now.
On 2024-01-18 10:56, "Zakelly Lan" wrote:
> Hi, could you provide some more de
Hi, could you provide some more details? For example: is it a DataStream job?
Is State TTL configured? Do you observe the growth via checkpoint monitoring, and
what is the overall size? Which files are largest among the cp files or in the local
RocksDB directory?
On Wed, Jan 17, 2024 at 4:09 PM fufu wrote:
I have a Flink job on flink-1.14.6 with a window combining incremental aggregation
(AggregateFunction) and full-window processing (ProcessWindowFunction). While the
job runs, this operator's state keeps growing by several hundred MB per day. How
should I troubleshoot this? I am using event time and the watermark advances
normally; all the other operators are fine, only this one keeps growing, which is
very strange. I found a similar article online,
https://blog.csdn.net/RL_LEEE/article/details/123864487, and want to try it, but I
don't know how to set the manifest size and could not find the corresponding
parameter. Could the community advise, or is there another solution? Thanks!
Hi,
IIUC, yes.
--
Best!
Xuyang
On 2023-12-04 15:13, "arjun s" wrote:
Thank you for providing the details. Can it be confirmed that the Hashmap
within the accumulator stores the map in RocksDB as a binary object and
undergoes deserialization/serialization during the execution of the
aggregate function?
Thanks,
Arjun
On Mon, 4 Dec 2023 at 12:24, Xuyang wrote
Hi, Arjun.
> I'm using a HashMap to aggregate the results.
Do you mean you define a HashMap in the accumulator? If yes, I think it is
stored as a binary map object in RocksDB and deserialized like this[1].
If you are using Flink SQL, you can try to debug the class 'WindowOpera
Hi team,
I'm new to Flink's window and aggregate functions, and I've configured my
state backend as RocksDB. Currently, I'm computing the count of each ID
within a 10-minute duration from the data source. I'm using a HashMap to
aggregate the results. Now, I'm interested in understanding where
, the operator will randomly access the keys in
RocksDB; enabling the bloom filter in RocksDB will help a lot in this situation.
Cons:
With the bloom filter enabled, RocksDB's compaction process will add bloom filter
information to newly generated SST files. This operation executes
asynchronously in the background
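If the trade-off described above is acceptable, enabling the filter is a single configuration key. The key name below is as documented for recent Flink versions; verify it against your release.

```yaml
state.backend.rocksdb.use-bloom-filter: true
```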
the performance of your jobs. But make sure to configure
your jobs so that they will be able to accommodate the potential memory
footprint growth. Also please read the following resources to know more
about RocksDBs bloom filter:
https://github.com/facebook/rocksdb/wiki/RocksDB-Bloom-Filter
https://rocksdb.org
I don't know much about the performance improvements that may come from using
bloom filters, but I believe you can also improve RocksDB performance by
increasing managed memory:
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#taskmanager-memory-managed-fraction
Thanks
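In configuration terms, that suggestion is the following key; 0.4 is the default fraction, and 0.6 here is just an illustrative increase, not a recommendation from the thread.

```yaml
# Fraction of total Flink memory given to managed memory (used by RocksDB, among others).
taskmanager.memory.managed.fraction: 0.6
```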
The recommended practice for RocksDB usage is to have local disks accessible to
it. The Kubernetes Operator doesn’t have fields related to creating disks for
RocksDB to use.
For instance, say I have maxParallelism=10 but parallelism=1. I have a
statically created PVC named “flink-rocksdb
Hi, Patrick:
We have encountered the same issue: the TaskManager's memory consumption
increases almost monotonically.
I'll try to describe what we have observed and our solution. You can check
if it would solve the problem.
We have observed that
1. Jobs with RocksDB state backend would fail
Hello,
We are running Flink jobs on K8s and using RocksDB as state backend. It is
connected to S3 for checkpointing. We have multiple states in the job (mapstate
and value states). We are seeing a slow but stable increase over time on the
memory consumption. We only see this in our jobs
Hi,
https://flink-learning.org.cn/article/detail/c1db8bc157c72069979e411cd99714fd
This article has some theory and worked examples on computing Flink RocksDB write
buffer and block cache memory that you can refer to.
On Fri, Sep 1, 2023 at 2:56 PM crazy <2463829...@qq.com.invalid> wrote:
> Hi all,
> Flink 1.13.5,
> state backend based on RocksDB. A question about managed
I agree with Yaroslav; generally speaking, PVs are not necessary or even
recommended for RocksDB, because the state doesn't need to be shared and is
recovered later anyway.
It's usually faster and cheaper to go with instance level SSDs.
Gyula
On Wed, Aug 30, 2023 at 8:37 PM Yaroslav Tkachenko
wrote
It depends on your requirements. Personally, I don't use PVs and, instead,
mount a volume from a host with a fast instance-level SSD.
On Wed, Aug 30, 2023 at 11:26 AM Tony Chen wrote:
> We used to have a Persistent Volume (PV), attached to the pod, for storing
> the RocksDB data while
We used to have a Persistent Volume (PV), attached to the pod, for storing
the RocksDB data while using the GoogleCloudPlatform operator. For the
Apache flink-kubernetes-operator, do the pods need a PV attached to it to
use RocksDB? If not, do you have recommendations on memory configuration
Hi!
RocksDB is supported, and every other state backend as well.
You can simply set this in your config like before :)
Cheers
Gyula
On Wed, 30 Aug 2023 at 19:22, Tony Chen wrote:
> Hi Flink Community,
>
> Does the flink-kubernetes-operator support RocksDB as the stat
Hey Tony,
Pretty much all Flink configuration is supported, including the RocksDB
state backend.
On Wed, Aug 30, 2023 at 9:05 AM Tony Chen wrote:
> Hi Flink Community,
>
> Does the flink-kubernetes-operator support RocksDB as the state backend
> for FlinkDeployment?
>
>
Hi Flink Community,
Does the flink-kubernetes-operator support RocksDB as the state backend for
FlinkDeployment?
We have some Flink applications that have large states, and we were able to
deal with these large states in the past with RocksDB. If there is no
support for RocksDB, are there any
Hi all,
A question about RocksDB memory management: Flink 1.13.5, Flink on YARN,
regarding block_cache and write_buffer_manager:
block_cache_capacity = ((3 - writeBufferRatio) * totalMemorySize / 3)
write_buffer_manager = (2 * totalMemorySize * writeBufferRatio / 3)
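Plugging numbers into the two formulas above makes the split concrete. The sketch below assumes a hypothetical 512 MiB of managed memory and a writeBufferRatio of 0.5; both values are illustrative, not from the original mail.

```java
public class RocksDBMemoryCalc {

    // block_cache_capacity = ((3 - writeBufferRatio) * totalMemorySize / 3)
    static long blockCacheCapacity(long totalMemorySize, double writeBufferRatio) {
        return (long) ((3 - writeBufferRatio) * totalMemorySize / 3);
    }

    // write_buffer_manager = (2 * totalMemorySize * writeBufferRatio / 3)
    static long writeBufferManagerSize(long totalMemorySize, double writeBufferRatio) {
        return (long) (2 * totalMemorySize * writeBufferRatio / 3);
    }

    public static void main(String[] args) {
        long total = 512L * 1024 * 1024; // hypothetical managed memory per slot, in bytes
        double ratio = 0.5;              // hypothetical write buffer ratio

        System.out.println("block_cache_capacity = " + blockCacheCapacity(total, ratio));
        System.out.println("write_buffer_manager = " + writeBufferManagerSize(total, ratio));
    }
}
```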
Hi neha,
1. You can set the path of jemalloc into LD_LIBRARY_PATH of YARN[1],
and here is a blog post about "RocksDB Memory Usage"[2].
2. The default value of cleanupInRocksdbCompactFilter is 1000[3],
maybe another value can be set according to the TPS of the job.
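On point 2, the query interval is passed when the TTL config is built. A minimal sketch follows; the one-day TTL, the descriptor name, and the 1000-entry query interval are illustrative only.

```java
import org.apache.flink.api.common.state.StateTtlConfig;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.time.Time;

// Hypothetical descriptor with RocksDB compaction-filter TTL cleanup.
StateTtlConfig ttlConfig = StateTtlConfig
        .newBuilder(Time.days(1))               // illustrative TTL
        .cleanupInRocksdbCompactFilter(1000)    // re-check the current timestamp every 1000 processed entries
        .build();

ValueStateDescriptor<Long> descriptor = new ValueStateDescriptor<>("lastSeen", Long.class);
descriptor.enableTimeToLive(ttlConfig);
```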
Hi neha,
Due to a limitation of RocksDB, we cannot create a strict-capacity-limit
LRUCache shared among RocksDB instance(s); FLINK-15532[1] was created to
track this.
BTW, have you set TTL for this job[2]? TTL can help control the state size.
[1] https://issues.apache.org/jira/browse
n increasing. I suspect it might be because of
> RocksDB.
>
> we have the default value for state.backend.rocksdb.memory.managed as
> true. Can anyone confirm that this config will Rockdb be able to take the
> unbounded native memory?
>
> If yes, what metrics can I check
Hello,
I am trying to debug the unbounded memory consumption by the Flink process.
The heap size of the process remains the same. The size of the RSS of the
process keeps on increasing. I suspect it might be because of RocksDB.
we have the default value for state.backend.rocksdb.memory.managed
, as this is one of the
main differences between RocksDBStateBackend and HashMapStateBackend
(HashMapStateBackend does not perform serialization and deserialization).
On Wed, Jun 21, 2023 at 3:44 PM Prabhu Joseph
wrote:
> Hi,
>
> RocksDB State Backend GET call on a key that was PUT into the state like
Hi,
A RocksDB state backend GET call on a key that was PUT into the state about
100 ms earlier intermittently does not return it. The issue never happened
with the HashMap state backend. We are trying to increase the block cache size
and write buffer size, and to enable the bloom filter as per the doc:
https
Hi, currently the memory RocksDB uses is not strictly limited; see this ticket:
https://issues.apache.org/jira/browse/FLINK-15532
To locate the memory usage, you can first look at some coarse metrics:
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#rocksdb-native-metrics
To further pinpoint the detailed memory usage of RocksDB within a single
instance, you may need a malloc profiling tool, such as Jemalloc's
Hi, Gyula.
It seems related to https://issues.apache.org/jira/browse/FLINK-23346.
We also saw core dump while using list state after triggering state
migration and ttl compaction filter. Have you triggered the schema
evolution ?
It seems a bug of the rocksdb list state together with ttl compaction
Hi All!
We are encountering an error on a larger stateful job (around 1 TB + state)
on restore from a rocksdb checkpoint. The taskmanagers keep crashing with a
segfault coming from the rocksdb native logic and seem to be related to the
FlinkCompactionFilter mechanism.
The gist with the full
Hi,
This is a timeout of a message from the TM to the JM. You can check whether the
JM has error logs, or whether the corresponding TM or JM has exhausted its
resources, causing the akka message to time out.
Best,
Shammon FY
On Sun, Apr 23, 2023 at 2:28 PM crazy <2463829...@qq.com.invalid> wrote:
> Hi,
> There is a Flink on YARN program on flink-1.13.5 with the RocksDB statebackend.
> After the job runs for a while, the following stack trace exception appears:
YARN. I have already disabled YARN's memory check, and the glibc parameter is set to 1.
On Fri, Apr 21, 2023 at 19:23, Weihua Hu wrote:
> Hi,
>
> Is your job running on YARN or Kubernetes? You can first look at the glibc
> leak issue in the docs.
>
> Best,
> Weihua
>
> On Fri, Apr 21, 2023 at 6:04 PM Guo Thompson
> wrote:
>
> > The Flink
> > job is SQL-based, Flink ver
Hi,
Is your job running on YARN or Kubernetes? You can first look at the glibc
leak issue in the docs.
Best,
Weihua
On Fri, Apr 21, 2023 at 6:04 PM Guo Thompson wrote:
> The Flink job is SQL-based, on Flink 1.13.3, with state stored in RocksDB. We
> found what looks like a memory leak: after the job runs for a while, it is
> killed by the Linux kernel. How can this be solved?
> This article online
> http://www.whitewood.me/2021/01/02/%E8%AF%A6%E8%A7%A3-
The Flink job is SQL-based, on Flink 1.13.3, with state stored in RocksDB. We found
what looks like a memory leak: after running for a while, the job is killed by the
Linux kernel. How can this be solved?
This article online,
http://www.whitewood.me/2021/01/02/%E8%AF%A6%E8%A7%A3-Flink-%E5%AE%B9%E5%99%A8%E5%8C%96%E7%8E%AF%E5%A2%83%E4%B8%8B%E7%9A%84-OOM-Killed/
says it is very likely that RocksDB's memory cannot be reclaimed.
1. The TM is allocated 30 GB of memory, and the JVM heap is far from fully used.
[image
Thanks for the reply. Our application on Flink 1.11 could support adding a field
with a default value. Is the 1.16 Table API really not compatible with this?
On Mon, Apr 17, 2023 at 11:21 PM Shammon FY wrote:
> Hi
>
> Currently, adding or removing columns makes the state incompatible.
>
> Best,
> Shammon FY
>
>
> On Fri, Apr 14, 2023 at 9:09 PM Elvis Chen
> wrote:
>
> > We are using flink-1.1
Hi
Currently, adding or removing columns makes the state incompatible.
Best,
Shammon FY
On Fri, Apr 14, 2023 at 9:09 PM Elvis Chen
wrote:
> We are using flink-1.16.0's Table API with RocksDB as the backend to provide a
> service for our users to run SQL queries. The tables are created with Avro
> schemas. When the schema is changed in a compatible way, e.g. adding a field
> with a default value, we cannot restore the job from a savepoint. This is the
> error after the schema upgrade:
We are using flink-1.16.0's Table API with RocksDB as the backend to provide a
service for our users to run SQL queries. The tables are created with Avro schemas.
When the schema is changed in a compatible way, e.g. adding a field with a default
value, we cannot restore the job from a savepoint. This is the error after the
schema upgrade:
Caused by: org.apache.flink.util.StateMigrationException: The new state
serializer
(org.apache.flink.table.runtime.typeutils.RowDataSerializer@aad5b03a
If I store the Java protobuf objects in the rocksdb instead of the scala
objects, I get this stacktrace:
2023-02-07 09:17:04,246 WARN org.apache.flink.runtime.taskmanager.Task
[] - KeyedProcess -> (Map -> Sink: signalSink, Map -> Flat
Map -> Sink: FeatureSink, Sink: l
tore, Sink: logsink) (2/2)#0
(1befbd4d8975833fc973fc080ea866e4) switched from RUNNING to FAILED with
failure cause: org.apache.flink.util.FlinkRuntimeException: Error while
retrieving data from RocksDB.
at
org.apache.flink.contrib.streaming.state.RocksDBValueState.val
Hi,
IIUC, numRetainedCheckpoints will only influence the space overhead of the
checkpoint dir, not the incremental size.
RocksDB executes incremental checkpoints based on the shared directory, which
will always retain SST files as much as possible (maybe from the last
checkpoint, or maybe from
Hi,
After going through the following article about RocksDB incremental
checkpoints
(https://flink.apache.org/features/2018/01/30/incremental-checkpointing.html),
my understanding was that at each checkpoint Flink only uploads newly
created SSTables, whereas the others it can reference from
e base? I
would like to understand the impact if we make changes in our local Flink
code with regards to testing efforts and any other affected modules?
Can you please clarify this?
Thanks,
Vidya Sagar.
On Wed, Dec 7, 2022 at 7:59 AM Yanfei Lei wrote:
> Hi Vidya Sagar,
>
> Thanks for brin
Hi Vidya Sagar,
Thanks for bringing this up.
The RocksDB state backend defaults to Snappy[1]. If the compression option
is not specifically configured, this vulnerability of ZLIB has no effect on
the Flink application for the time being.
*> is there any plan in the coming days to addr
*How is it linked to Flink?: *
In the Flink statebackend rocksdb, ZLIB version 1.2.11 is used as part of
the .so file. Hence, there is vulnerability exposure here.
*Flink code details/links:*
I am seeing the latest Flink code base where the statebackend rocksdb
library *(frocksdbjni
ent implementation, because the random UUID of old
instancePath isn't recorded and we don't know which path to delete.
> *What is the general design recommendation is such cases where RocksDB
has mount path to a Volume on host node?*
For me, I usually use emptyDir[1] to sidestep the deleting p
nt scenario that is on K8s.
>>>
>>> - In K8s set up, I have Volume on the cluster node and mount path is
>>> specified for the RockDB checkpoints location. So, when the Application TM
>>> POD is restarted, the older checkpoints are read back from the host
t; specified for the RockDB checkpoints location. So, when the Application TM
> POD is restarted, the older checkpoints are read back from the host path
> again when the TM is UP again.
> In this case, RocksDB local directory is pulled with all the older data
> which is not useful for the JOB ID