[DISCUSS] Refactor StateBackend into Partitioned State and Non-Partitioned State Backends

Aljoscha Krettek Thu, 07 Jan 2016 02:03:08 -0800

Hi,
I’m currently examining ways to 1) change the window operators to use the 
partitioned state abstraction for window state and 2) implement state backends 
for managed memory/out-of-core state.


I think it would be helpful to pull the state backend apart. Right now, for 
example, the DbStateBackend has a custom way of specifying another state 
backend that should be used for non-partitioned state, since a database really 
only makes sense for partitioned state. I was thinking about adding a state 
backend based on RocksDB, which would also only make sense for partitioned 
state. Separating the two kinds of state would allow each implementation to 
focus on the important parts and give the user flexibility, without requiring 
every state backend to implement that delegation itself.
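
To make the proposed split concrete, here is a minimal Java sketch of how 
partitioned and non-partitioned state could be served by two separate 
backends that a job picks independently. Every name in it 
(PartitionedStateBackend, NonPartitionedStateBackend, the Heap* 
implementations, StateBackends, SplitBackendDemo) is hypothetical and chosen 
for illustration only; none of it is Flink's actual API, and checkpointing and 
serialization are left out entirely.

```java
import java.util.HashMap;
import java.util.Map;

// NOTE: every type below is hypothetical and only illustrates the proposed split.

/** Keyed (partitioned) state: one value per key. */
interface PartitionedStateBackend<K, V> {
    void put(K key, V value);
    V get(K key); // returns null when the key has no state yet
}

/** Operator-wide (non-partitioned) state: a single value. */
interface NonPartitionedStateBackend<S> {
    void store(S state);
    S load(); // returns null when nothing was stored
}

/** In-memory stand-in for a partitioned backend (a database or RocksDB could sit here). */
final class HeapPartitionedBackend<K, V> implements PartitionedStateBackend<K, V> {
    private final Map<K, V> values = new HashMap<>();
    @Override public void put(K key, V value) { values.put(key, value); }
    @Override public V get(K key) { return values.get(key); }
}

/** In-memory stand-in for a non-partitioned backend. */
final class HeapNonPartitionedBackend<S> implements NonPartitionedStateBackend<S> {
    private S state;
    @Override public void store(S state) { this.state = state; }
    @Override public S load() { return state; }
}

/** Pairs the two backends so a job can choose each one independently. */
final class StateBackends<K, V, S> {
    final PartitionedStateBackend<K, V> partitioned;
    final NonPartitionedStateBackend<S> nonPartitioned;

    StateBackends(PartitionedStateBackend<K, V> partitioned,
                  NonPartitionedStateBackend<S> nonPartitioned) {
        this.partitioned = partitioned;
        this.nonPartitioned = nonPartitioned;
    }
}

final class SplitBackendDemo {
    /** Counts one element for the given key using only the partitioned backend. */
    static long countElement(StateBackends<String, Long, Long> backends, String key) {
        Long current = backends.partitioned.get(key);
        long updated = (current == null ? 0L : current) + 1L;
        backends.partitioned.put(key, updated);
        return updated;
    }

    public static void main(String[] args) {
        StateBackends<String, Long, Long> backends =
                new StateBackends<String, Long, Long>(
                        new HeapPartitionedBackend<String, Long>(),
                        new HeapNonPartitionedBackend<Long>());
        countElement(backends, "a");
        countElement(backends, "a");
        countElement(backends, "b");
        backends.nonPartitioned.store(3L); // e.g. a source offset, not tied to any key
        System.out.println(backends.partitioned.get("a"));
        System.out.println(backends.nonPartitioned.load());
    }
}
```

In a layout like this, a partitioned-only backend never has to know how 
non-partitioned state is stored; swapping HeapPartitionedBackend for another 
implementation leaves the non-partitioned side untouched.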

What do you think about pulling the backends apart?

—
Aljoscha

P.S. I have a prototype WindowOperator on partitioned state that does not 
regress in performance compared to the current WindowOperator. I also have a 
prototype RocksDB state backend. Its performance is about 1/10th of that of 
the in-memory state backend (with 100,000 keys), but it scales much better: 
with the in-memory state backend, performance goes down as the number of keys 
increases, while it stays constant with RocksDB. This is quite nice since it 
allows using the same windowing code while exchanging the state backend based 
on the job requirements.
