Hi Gabriele,

I'd recommend extending the existing window function whenever possible, as 
Flink will automatically cover state management for you and no need to be 
concerned with state backend details. Incremental aggregation for reduce state 
size is also out of the box if your usage can be satisfied with the 
reduce/aggregate function pattern, which is important for large windows.

Best,
Zhanghao Chen
________________________________
From: Gabriele Mencagli <gabriele.menca...@gmail.com>
Sent: Monday, March 4, 2024 19:38
To: user@flink.apache.org <user@flink.apache.org>
Subject: Question about time-based operators with RocksDB backend


Dear Flink Community,

I am using Flink with the DataStream API and operators implemented using 
RichedFunctions. I know that Flink provides a set of window-based operators 
with time-based semantics and tumbling/sliding windows.

By reading the Flink documentation, I understand that there is the possibility 
to change the memory backend utilized for storing the in-flight state of the 
operators. For example, using RocksDB for this purpose to cope with a 
larger-than-memory state. If I am not wrong, to transparently change the 
backend (e.g., from in-memory to RocksDB) we have to use a proper API to access 
the state. For example, the Keyed State API with different abstractions such as 
ValueState<T>, ListState<T>, etc... as reported 
here<https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/datastream/fault-tolerance/state/>.

My question is related to the utilization of time-based window operators with 
the RocksDB backend. Suppose for example very large temporal windows with a 
huge number of keys in the stream. I am wondering if there is a possibility to 
use the built-in window operators of Flink (e.g., with an AggregateFunction or 
a more generic ProcessWindowFunction as 
here<https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/datastream/operators/windows/>)
 transparently with RocksDB support as a state back-end, or if I have to 
develop the window operator in a raw manner using the Keyed State API (e.g., 
ListState, AggregateState) for this purpose by implementing the underlying 
window logic manually in the code of RichedFunction of the operator (e.g., a 
FlatMap).

Thanks for your support,

--
Gabriele Mencagli

Reply via email to