I agree I might have been too quick to claim that DoFn output needs to fit in memory. I am actually not sure what the Beam model says on this matter, or what the output managers of particular runners do about it.
But SparkRunner definitely has an issue here. I tried setting a small `fetchSize` for JdbcIO as well as changing `storageLevel` to MEMORY_AND_DISK. All attempts fail with an OOM. Looking at the heap, most of it is used by the linked-list multimap of DoFnOutputManager here: https://github.com/apache/beam/blob/v2.15.0/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/MultiDoFnFunction.java#L234
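
To illustrate why neither setting helps, here is a minimal plain-Java sketch of the pattern I believe I am seeing in the heap. It is not Beam's actual code: `BufferingOutputManager`, `bufferAll` and `rows` are hypothetical names, and a `Map` of lists stands in for the multimap. The source yields rows lazily, one at a time (as a small fetch size would), but the manager still retains every element until the whole input has been processed.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

class BufferedOutputSketch {

    /** Hypothetical stand-in for an output manager that buffers every element per tag. */
    static final class BufferingOutputManager {
        private final Map<String, List<Object>> outputs = new LinkedHashMap<>();

        void output(String tag, Object value) {
            // Every emitted element is kept on the heap until the caller drains the buffer.
            outputs.computeIfAbsent(tag, k -> new ArrayList<>()).add(value);
        }

        int retained() {
            int n = 0;
            for (List<Object> values : outputs.values()) {
                n += values.size();
            }
            return n;
        }
    }

    /** Lazily produces `count` rows one at a time, like a cursor with a small fetch size. */
    static Iterator<Integer> rows(final int count) {
        return new Iterator<Integer>() {
            private int next = 0;

            @Override
            public boolean hasNext() {
                return next < count;
            }

            @Override
            public Integer next() {
                if (next >= count) {
                    throw new NoSuchElementException();
                }
                return next++;
            }
        };
    }

    /** Pushes the whole source through the manager and reports how many elements it holds. */
    static int bufferAll(Iterator<Integer> source) {
        BufferingOutputManager manager = new BufferingOutputManager();
        while (source.hasNext()) {
            manager.output("main", source.next());
        }
        return manager.retained();
    }

    public static void main(String[] args) {
        System.out.println("retained elements: " + bufferAll(rows(100_000)));
    }
}
```

In this sketch the number of retained elements always equals the number of rows read, no matter how lazily the source produces them, which would match an OOM that is independent of `fetchSize` and `storageLevel`.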