Hi everyone,

I’m running Flink 2.2 + K8s Operator 1.14 with the ForSt async state backend, checkpoints in S3, and working state around 1TB.
With upgradeMode: last-state and the adaptive scheduler, we observe:

- Downscale rescales (e.g. parallelism 24 -> 12) take roughly 2 hours from spec change to RUNNING, during which no events are processed.
- Upscale rescales in the same flow are much faster, maybe 2-10 min.

The relevant ForSt / checkpointing settings are (full key names in the P.S. below):

- use-ingest-db-restore-mode: true
- incremental-restore-async-compact-after-rescale: true
- state-recovery.claim-mode: CLAIM
- NATIVE savepoint format, incremental + compressed checkpoints
- pipeline.max-parallelism: 120
- ForSt cache size-based-limit of 350GB on a 500Gi gp3 volume per TM
- One slot per TM; I’ve tried parallelisms between 12 and 24

A few questions:

- Is ~2h for a downscale at this state size in line with what others see, or does it suggest a misconfiguration? Most of the time appears to be spent in restore.
- Is the asymmetry, where upscales are much faster, also expected?
- Given that asymmetry, are there settings or patterns that make autoscaling viable at ~1TB state beyond adjusting job.autoscaler.scale-down.max-factor and scale-down.interval? Or is the practical recommendation at this scale to keep parallelism static and rescale manually during low-traffic windows?

Side note: we also see OOMKilled containers periodically when per-TM state is large, despite managed.fraction 0.5 and kubernetes.taskmanager.memory.limit-factor 1.5. ForSt native memory appears to overshoot the managed budget. Any specific settings worth investigating before bumping pod memory?

Happy to share the full config and metrics if useful.

Thanks,
Francis
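P.S. For reference, here is roughly how the settings above look in our flinkConfiguration. I’m paraphrasing rather than copy-pasting, so treat the exact ForSt key names as approximate; I can share the verbatim config on request:

    state.backend.type: forst
    state.backend.forst.use-ingest-db-restore-mode: true
    state.backend.forst.incremental-restore-async-compact-after-rescale: true
    state.backend.forst.cache.size-based-limit: 350gb
    execution.state-recovery.claim-mode: CLAIM
    execution.checkpointing.incremental: true
    pipeline.max-parallelism: 120
    jobmanager.scheduler: adaptive
    taskmanager.memory.managed.fraction: 0.5
    kubernetes.taskmanager.memory.limit-factor: 1.5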
