Sorry for the late reply.
There's not much you can do at the moment, as Flink needs to sync on the
checkpoint barriers.
Unaligned checkpoints (FLIP-76) are in the works to address this:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-76%3A+Unaligned+Checkpoints
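
For readers wondering what "sync on the checkpoint barriers" means in practice, here is a toy model of aligned checkpointing for a single operator with two input channels. It is not Flink code: the class, the method, and the event strings are all invented for illustration. Once a channel has delivered its barrier, the operator stops reading from it until the barrier has arrived on the other channel too, so records queued behind the first barrier have to wait.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical toy model of barrier alignment; none of these names are Flink APIs.
final class BarrierAlignmentSketch {
    static final String BARRIER = "BARRIER";

    // Reads both channels round-robin and returns the order in which the
    // operator handles events. A channel that has delivered its barrier is
    // blocked until the other channel's barrier has arrived as well.
    static List<String> run(Deque<String> ch0, Deque<String> ch1) {
        List<Deque<String>> channels = List.of(ch0, ch1);
        boolean[] blocked = new boolean[2];
        List<String> log = new ArrayList<>();
        boolean progressed = true;
        while (progressed) {
            progressed = false;
            for (int i = 0; i < 2; i++) {
                Deque<String> ch = channels.get(i);
                if (blocked[i] || ch.isEmpty()) {
                    continue; // a blocked channel keeps its records queued
                }
                String event = ch.poll();
                progressed = true;
                if (event.equals(BARRIER)) {
                    blocked[i] = true;
                    if (blocked[0] && blocked[1]) {
                        log.add("snapshot"); // all barriers in: take the snapshot
                        blocked[0] = false;  // and resume both channels
                        blocked[1] = false;
                    }
                } else {
                    log.add(event);
                }
            }
        }
        return log;
    }

    public static void main(String[] args) {
        Deque<String> fast = new ArrayDeque<>(List.of("a1", BARRIER, "a2", "a3"));
        Deque<String> slow = new ArrayDeque<>(List.of("b1", "b2", "b3", BARRIER, "b4"));
        // a2 and a3 are available early but are only handled after the snapshot.
        System.out.println(run(fast, slow));
    }
}
```

In this model the records behind the early barrier are held back until the slower channel catches up. As far as the FLIP describes it, unaligned checkpoints aim to remove that wait by letting barriers overtake buffered records.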
Did you try out using the FsSta
Yes, FsStateBackend would be the best fit for state access performance in
this case. Just a reminder that FsStateBackend uploads the full state
to DFS on every checkpoint, so please watch the network bandwidth usage
and make sure it doesn't become a new bottleneck.
Best Regards,
Yu
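
As background on the state access point above: RocksDB keeps state as serialized bytes, so every read and update of a ValueState holding a TreeSet has to decode and re-encode the whole set, while FsStateBackend keeps working state as live objects on the JVM heap. The sketch below imitates the two access patterns with plain java.io serialization. That is an assumption made to keep the example self-contained (Flink uses its own serializers), and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.TreeSet;

// Hypothetical stand-in for the two access patterns; not Flink code.
final class StateAccessSketch {
    // Heap-style access: mutate the live object in place.
    static void addOnHeap(TreeSet<Long> state, long value) {
        state.add(value);
    }

    // Serialized-style access: decode the whole set, mutate it, encode it again.
    static byte[] addSerialized(byte[] stored, long value) {
        TreeSet<Long> state = decode(stored);
        state.add(value);
        return encode(state);
    }

    static byte[] encode(TreeSet<Long> set) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(set);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    @SuppressWarnings("unchecked")
    static TreeSet<Long> decode(byte[] stored) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(stored))) {
            return (TreeSet<Long>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int updates = 2_000;

        TreeSet<Long> heap = new TreeSet<>();
        long t0 = System.nanoTime();
        for (long i = 0; i < updates; i++) {
            addOnHeap(heap, i);
        }
        long heapMicros = (System.nanoTime() - t0) / 1_000;

        byte[] stored = encode(new TreeSet<>());
        long t1 = System.nanoTime();
        for (long i = 0; i < updates; i++) {
            stored = addSerialized(stored, i);
        }
        long serializedMicros = (System.nanoTime() - t1) / 1_000;

        System.out.println("on-heap updates:    " + heapMicros + " us");
        System.out.println("serialized updates: " + serializedMicros + " us");
        System.out.println("same contents:      " + heap.equals(decode(stored)));
    }
}
```

The timings are only indicative and depend on the machine and JVM warm-up. The point is that the per-update cost of the serialized path grows with the size of the set, whereas the on-heap path pays only for the insertion itself.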
On Fri, 2
I would try the FsStateBackend in this scenario, as you have enough memory
available.
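
A back-of-the-envelope check of the "enough memory" reasoning, using the figures Ran gives in this thread (200-300 GB total state, parallelism 120, 15 GB per TM). The number of slots per TaskManager is not stated anywhere in the thread, so `slotsPerTm` below is purely an assumption, and the helper names are hypothetical. The sketch also assumes keys are spread evenly across subtasks.

```java
// Hypothetical sizing helper; plain arithmetic, not a Flink API.
final class StateSizingSketch {
    // Average state per parallel subtask, assuming an even key distribution.
    static double perSubtaskGb(double totalStateGb, int parallelism) {
        return totalStateGb / parallelism;
    }

    // State held by one TaskManager if it runs slotsPerTm subtasks of the job.
    static double perTaskManagerGb(double perSubtaskGb, int slotsPerTm) {
        return perSubtaskGb * slotsPerTm;
    }

    public static void main(String[] args) {
        int parallelism = 120;     // from the thread
        double tmMemoryGb = 15.0;  // from the thread
        int slotsPerTm = 1;        // ASSUMPTION: not stated in the thread

        for (double totalGb : new double[] {200.0, 300.0}) {
            double subtask = perSubtaskGb(totalGb, parallelism);
            double tm = perTaskManagerGb(subtask, slotsPerTm);
            System.out.printf(
                    "total %.0f GB -> %.2f GB per subtask, %.2f GB per TM (of %.0f GB)%n",
                    totalGb, subtask, tm, tmMemoryGb);
        }
    }
}
```

This works out to roughly 1.7 to 2.5 GB per subtask. Two caveats: the on-heap footprint of a TreeSet is usually larger than its serialized size, and not all of the 15 GB configured for a TM is available as JVM heap, so it is worth leaving generous headroom.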
On Thu, Jan 30, 2020 at 5:26 PM Ran Zhang wrote:
Hi Gordon,
Thanks for your reply! Regarding state size: we are at 200-300 GB, but with
a parallelism of 120 each task handles only ~2-3 GB of state.
(When we submit the job, we set the TM memory to 15 GB.) In this scenario,
which state backend would be the best fit?
Thanks,
Ran
On Wed, Ja
Hi Ran,
On Thu, Jan 30, 2020 at 9:39 AM Ran Zhang wrote:
Hi all,
We have a Flink app that uses a KeyedProcessFunction. The function keeps a
ValueState holding a TreeSet, and the processElement method needs to
access and update it. We tried RocksDB as our state backend, but the
performance is not good, and intuitively we think it was because o