Hi Gabriele,

The keyed state APIs (ValueState、ListState、etc....) are supported by all
types of state backend (hashmap、rocksdb、etc.). And the built-in window
operators are implemented with these state APIs internally. So you can use
these built-in operators/functions with the RocksDB state backend right out
of the box [1].

[1]
https://nightlies.apache.org/flink/flink-docs-master/docs/ops/state/state_backends/#setting-default-state-backend

Best,
Jinzhong Li


On Tue, Mar 5, 2024 at 10:59 AM Zakelly Lan <zakelly....@gmail.com> wrote:

> Hi Gabriele,
>
> Quick answer: You can use the built-in window operators which have been
> integrated with state backends including RocksDB.
>
>
> Thanks,
> Zakelly
>
> On Tue, Mar 5, 2024 at 10:33 AM Zhanghao Chen <zhanghao.c...@outlook.com>
> wrote:
>
>> Hi Gabriele,
>>
>> I'd recommend extending the existing window function whenever possible,
>> as Flink will automatically cover state management for you and no need to
>> be concerned with state backend details. Incremental aggregation for reduce
>> state size is also out of the box if your usage can be satisfied with the
>> reduce/aggregate function pattern, which is important for large windows.
>>
>> Best,
>> Zhanghao Chen
>> ------------------------------
>> *From:* Gabriele Mencagli <gabriele.menca...@gmail.com>
>> *Sent:* Monday, March 4, 2024 19:38
>> *To:* user@flink.apache.org <user@flink.apache.org>
>> *Subject:* Question about time-based operators with RocksDB backend
>>
>>
>> Dear Flink Community,
>>
>> I am using Flink with the DataStream API and operators implemented using
>> RichedFunctions. I know that Flink provides a set of window-based operators
>> with time-based semantics and tumbling/sliding windows.
>>
>> By reading the Flink documentation, I understand that there is the
>> possibility to change the memory backend utilized for storing the in-flight
>> state of the operators. For example, using RocksDB for this purpose to cope
>> with a larger-than-memory state. If I am not wrong, to transparently change
>> the backend (e.g., from in-memory to RocksDB) we have to use a proper API
>> to access the state. For example, the Keyed State API with different
>> abstractions such as ValueState<T>, ListState<T>, etc... as reported here
>> <https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/datastream/fault-tolerance/state/>
>> .
>>
>> My question is related to the utilization of time-based window operators
>> with the RocksDB backend. Suppose for example very large temporal windows
>> with a huge number of keys in the stream. I am wondering if there is a
>> possibility to use the built-in window operators of Flink (e.g., with an
>> AggregateFunction or a more generic ProcessWindowFunction as here
>> <https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/datastream/operators/windows/>)
>> transparently with RocksDB support as a state back-end, or if I have to
>> develop the window operator in a raw manner using the Keyed State API
>> (e.g., ListState, AggregateState) for this purpose by implementing the
>> underlying window logic manually in the code of RichedFunction of the
>> operator (e.g., a FlatMap).
>> Thanks for your support,
>>
>> --
>> Gabriele Mencagli
>>
>>

Reply via email to